REMARKS

Applicants have cancelled Claims 36 and 41-45 without prejudice to, or disclaimer of,
the subject matter contained therein. Applicants maintain that the cancellation of a claim makes 
no admission as to its patentability and reserve the right to pursue the subject matter of the 
cancelled claim in this or any other patent application. 

Applicants have amended Claims 27-31 and 35 to remove reference to the figures. 
Applicants have amended Claim 34 to read "a tag polypeptide". Applicants have amended Claim
35 to add the limitation "wherein said isolated polypeptide is overexpressed in lung or colon 
tumors compared to normal lung or colon tissue, respectively, or wherein said isolated 
polypeptide is encoded by a nucleic acid which is amplified in lung or colon tumor compared to 
normal lung or colon tissue, respectively." Applicants have added new Claims 46 and 47. 
Applicants maintain that the amendments add no new matter and are fully supported by the 
specification as originally filed. For example, support for the amendment to Claim 34 can be 
found in the substitute specification on page 44, lines 0-6. Support for the amendment to Claim 
35 can be found in the substitute specification, for example, in Example 16, beginning on page 
108, line 20, particularly at lines 21-25. 

Claims 27-35, 37-40, and 46-47 are presented for examination. Applicants respond 
below to the specific rejections raised by the Examiner in the Office Action mailed September 7, 
2004. For the reasons set forth below, Applicants respectfully traverse. 

Rejection under 35 U.S.C. § 101 - Utility

The PTO has rejected the pending claims under 35 U.S.C. § 101 as lacking patentable 
utility. The PTO concedes that the cited utilities are credible. However, the PTO alleges that the 
invention lacks both substantial and specific utility. Applicants respectfully disagree. 

Substantial Utility 

The PTO argues that the invention lacks substantial utility because the level of 
overexpression in cancer cells of the nucleic acid which encodes the PR0539 protein was 
minimal, and there is no evidence that overexpression is significant or a real effect and not 
simply produced by chance. In addition, the PTO argues that the invention lacks utility because 
the overexpression of the nucleic acid is not relevant to the utility of the protein and there is no 
evidence that the protein is overexpressed. The PTO cites three references to support its position
that there is no necessary correlation between gene amplification, gene expression, and protein 
expression. The PTO concludes that because there is no necessary connection between the 
amount of DNA in a cell and the amount of mRNA, and no necessary connection between the 
level of protein in a cell and the amount of mRNA, any evidence of overexpression of one 
component does not provide utility for the protein. The PTO argues that the current situation 
closely tracks Example 12 of the Utility Guidelines, because where there is no necessary 
relationship between the protein levels or utilities and a small level of mRNA overexpression in 
cancer cells, the invention lacks any "real world" context of use for PR0539. 

Utility need NOT be Proved to an Absolute Certainty - a Correlation between the Evidence and
the Asserted Utility is Sufficient

Compliance with 35 U.S.C. § 101 is a question of fact. Raytheon v. Roper, 724 F.2d 951,
956, 220 USPQ 592, 596 (Fed. Cir. 1983), cert. denied, 469 U.S. 835 (1984). The evidentiary
standard to be used throughout ex parte examination in setting forth a rejection is a
preponderance of the evidence, or "more likely than not" standard. In re Oetiker, 977 F.2d 1443,
1445, 24 USPQ2d 1443, 1444 (Fed. Cir. 1992). This is stated explicitly in the M.P.E.P.:

[T]he applicant does not have to provide evidence sufficient to establish that an 
asserted utility is true "beyond a reasonable doubt." Nor must the applicant 
provide evidence such that it establishes an asserted utility as a matter of 
statistical certainty. Instead, evidence will be sufficient if, considered as a whole, 
it leads a person of ordinary skill in the art to conclude that the asserted utility is 
more likely than not. M.P.E.P. at § 2107.02, part VII (2004) (emphasis in
original, internal citations omitted). 

The PTO has the initial burden to offer evidence "that one of ordinary skill in the art 
would reasonably doubt the asserted utility." In re Brana, 51 F.3d 1560, 1566, 34 U.S.P.Q.2d 
1436 (Fed. Cir. 1995). Only then does the burden shift to the Applicant to provide rebuttal 
evidence. Id. As stated in the M.P.E.P., such rebuttal evidence does not need to absolutely prove
that the asserted utility is real. Rather, the evidence only needs to be reasonably indicative of the 
asserted utility. 

In Fujikawa v. Wattanasin, 93 F.3d 1559, 39 U.S.P.Q.2d 1895 (Fed. Cir. 1996), the Court
of Appeals for the Federal Circuit upheld a PTO decision that in vitro testing of a novel
pharmaceutical compound was sufficient to establish practical utility, stating the following rule:

[T]esting is often required to establish practical utility. But the test results need
not absolutely prove that the compound is pharmacologically active. All that is
required is that the tests be "reasonably indicative of the desired
[pharmacological] response." In other words, there must be a sufficient
correlation between the tests and an asserted pharmacological activity so as to
convince those skilled in the art, to a reasonable probability, that the novel
compound will exhibit the asserted pharmacological behavior. Fujikawa v.
Wattanasin, 93 F.3d 1559, 1564, 39 U.S.P.Q.2d 1895 (Fed. Cir. 1996) (internal 
citations omitted, bold emphasis added, italics in original). 

While the Fujikawa case was in the context of utility for pharmaceutical compounds, the 
principles stated by the Court are applicable in the instant case where the asserted utility is for a
therapeutic and diagnostic use - utility does not have to be established to an absolute certainty, 
rather, the evidence must convince a person of skill in the art "to a reasonable probability." In 
addition, the evidence need not be direct, so long as there is a "sufficient correlation" between 
the tests performed and the asserted utility. 

The Court in Fujikawa relied in part on its decision in Cross v. Iizuka, 753 F.2d 1040,
224 U.S.P.Q. 739 (Fed. Cir. 1985). In Cross, the Appellant argued that basic in vitro tests
conducted in cellular fractions did not establish a practical utility for the claimed compounds.
Appellant argued that more sophisticated in vitro tests using intact cells, or in vivo tests, were
necessary to establish a practical utility. The Court in Cross rejected this argument, instead
favoring the argument of the Appellee:

[I]n vitro results... are generally predictive of in vivo test results, i.e., there is a
reasonable correlation therebetween. Were this not so, the testing procedures of 
the pharmaceutical industry would not be as they are. [Appellee] has not urged, 
and rightly so, that there is an invariable exact correlation between in vitro test 
results and in vivo test results. Rather, [Appellee's] position is that successful in 
vitro testing for a particular pharmacological activity establishes a significant 
probability that in vivo testing for this particular pharmacological activity will be 
successful. Cross v. Iizuka, 753 F.2d 1040, 1050, 224 U.S.P.Q. 739 (Fed. Cir.
1985) (emphasis added). 

The Cross case is very similar to the present case. Like in vitro testing in the 
pharmaceutical industry, those of skill in the field of biotechnology rely on the reasonable 
correlation that exists between gene expression and protein expression (see below). Were there 
no reasonable correlation between the two, the techniques that measure gene levels such as 
microarray analysis, differential display, and quantitative PCR would not be so widely used by 
those in the art. As in Cross, Applicants here do not argue that there is "an invariable exact
correlation" between gene expression and protein expression. Instead, Applicants' position 
detailed below is that a measured increase in gene expression or gene amplification in cancer 
cells establishes a "significant probability" that the encoded polypeptide will also be 
overexpressed in cancer based on "a reasonable correlation therebetween". 

Taken together, the legal standard for demonstrating utility is a relatively low hurdle. An 
Applicant need only provide evidence such that it is more likely than not that a person of skill 
in the art would be convinced, to a reasonable probability, that the asserted utility is true. 
The evidence need not be direct evidence, so long as there is a reasonable correlation between the 
evidence and the asserted utility. The standard is not absolute certainty. 

Even assuming that the PTO has met its initial burden to offer evidence that one of 
ordinary skill in the art would reasonably doubt the truth of the asserted utility, Applicants assert 
that they have met their burden of providing rebuttal evidence such that it is more likely than not 
those skilled in the art, to a reasonable probability, would believe that the PR0539 polypeptide is 
useful as a diagnostic tool for cancer. 

Applicants have established that the Gene Encoding the PR0539 Polypeptide is Amplified in
Lung and Colon Tumors compared to Normal Tissue and is Useful as a Diagnostic Tool 

Applicants first address the PTO's argument that the level of overexpression of nucleic 
acid encoding PR0539 was minimal and insignificant. Applicants submit that the gene 
amplification data provided in the present application are sufficient to establish a specific and 
substantial utility for the gene encoding the PR0539 polypeptide, as well as the PR0539 
polypeptide. 

Applicants previously submitted the declaration of Dr. Audrey Goddard with exhibits A-G.
In her declaration, Dr. Goddard states that a 2-fold increase in gene copy number, i.e., a ΔCt
value of 1, is "significant and useful" in detecting cancerous tumors or the diagnosis of cancer.
Goddard Declaration, paragraph 7. The nucleic acid encoding the PR0539 polypeptide has a ΔCt
value of 1 or greater in several tumor samples tested. Thus, the differential expression of the
nucleic acid encoding PR0539 can be used to distinguish cancerous tissue from normal tissue. 
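
As an illustration of the arithmetic behind Dr. Goddard's statement, the following minimal sketch converts a ΔCt value into a fold change in gene copy number. It assumes, for illustration only, that each PCR cycle ideally doubles the template, so that the fold change equals 2 raised to the power ΔCt. The function name and the efficiency parameter are hypothetical and do not appear in the declaration or the specification.

```python
def fold_change_from_delta_ct(delta_ct: float, efficiency: float = 2.0) -> float:
    """Convert a delta-Ct value into a fold change in template copy number.

    Assumption (not from the declaration): each PCR cycle multiplies the
    template by `efficiency`; 2.0 corresponds to ideal doubling per cycle.
    """
    return efficiency ** delta_ct


if __name__ == "__main__":
    # A delta-Ct of 1 corresponds to the 2-fold increase described above.
    for delta_ct in (1.0, 1.5, 2.0):
        fold = fold_change_from_delta_ct(delta_ct)
        print(f"delta Ct = {delta_ct:.1f} -> {fold:.2f}-fold")
```

Under this idealized assumption, any sample with a ΔCt value of 1 or greater reflects at least a 2-fold increase in copy number relative to the normal control.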

In the present Office Action, the PTO has not offered any reason to reject Dr. Goddard's
declaration. Applicants remind the PTO that the applicant need not provide evidence such that it
establishes an asserted utility "as a matter of statistical certainty." M.P.E.P. at § 2107.02, part
VII (2004). Applicants therefore submit that they have established that the gene amplification 
data reported in Example 16 are significant, and the utility for the PR0539 DNA in 
distinguishing between normal and cancerous tissue has been established. For the reasons 
discussed below, this leads to utility for the PR0539 polypeptides as well. 

Applicants have established that the Accepted Understanding in the Art is that there is a 
Reasonable Correlation between Gene Amplification and Overexpression of the Encoded Protein 

Applicants next address the PTO's argument that the invention lacks utility because the 
overexpression of the nucleic acid is not relevant to the utility of the protein, and there is no 
evidence that the protein is overexpressed. The PTO cites Pennica et al (Proc. Natl. Acad. Sci. 
(1998) 95:14717-14722) for the proposition that there is no necessary connection between the 
amount of DNA in a cell and the amount of mRNA in a cell. The PTO also cites Meric et al 
(Molecular Cancer Therapeutics (2002) 1:971-79) and Gokman-Polar (Cancer Research (2001) 
61:1375-81) to support its position that there is no necessary correlation between mRNA levels 
and protein levels. The PTO concludes that because there is no necessary connection between 
gene amplification and mRNA, and between mRNA and protein, any evidence of overexpression 
of one component does not provide utility for the protein. 

As discussed above, evidence of utility does not have to be to an absolute certainty, and 
therefore there does not need to be a necessary connection between gene amplification and 
protein expression. Rather, there need only be a reasonable correlation between the evidence 
offered and the asserted utility such that it is more likely than not that a person of skill in the art 
would be convinced, to a reasonable probability, that the asserted utility is true. 

Applicants submit that those of skill in the art would recognize that there is a reasonable 
correlation between amplification of a gene and an increase in gene expression. This assertion is 
supported by numerous references. Orntoft et al (Molecular and Cellular Proteomics, 1:37-45
(2002); submitted herewith as Exhibit 1) studied transcript levels of 5600 genes in malignant 
bladder cancers which were linked to a gain/loss of chromosomal material using an array-based 
method. Orntoft et al showed that there was a gene dosage effect and teach that "in general (18 
of 23 cases) chromosomal areas with more than 2-fold gain of DNA showed a corresponding 
increase in mRNA transcripts." Orntoft at 37, column 1, abstract. In addition, Hyman et al 
(Cancer Research, 62:6240-6245 (2002); submitted herewith as Exhibit 2) used CGH analysis
and cDNA microarrays to compare DNA copy numbers and mRNA expression of over 12,000 
genes in breast cancer tumors and cell lines. They showed that there is "evidence of a prominent 
global influence of copy number changes on gene expression levels." Hyman at 6244, column 1, 
last paragraph. 

Additional supportive teachings are also provided by Pollack et al. (PNAS, 99:12963- 
12968 (2002); submitted herewith as Exhibit 3) who studied a series of primary human breast 
tumors and found that "[b]y analyzing mRNA levels in parallel, we have also discovered that 
changes in DNA copy number have a large, pervasive, direct effect on global gene expression 
patterns in both breast cancer cell lines and tumors." Pollack at 12967 at column 1, emphasis 
added. Their study found that "62% of highly amplified genes show moderately or highly 
elevated expression, that DNA copy number influences gene expression across a wide range of 
DNA copy number alterations (deletion, low-, mid- and high-level amplification), that on 
average, a 2-fold change in DNA copy number is associated with a corresponding 1.5-fold 
change in mRNA levels." (Pollack at 12963, column 1, abstract). 

Together, these articles teach that it is more likely than not that gene
amplification increases mRNA expression. This evidence establishes that there is a reasonable 
correlation between gene amplification and gene expression, and one of skill in the art would 
believe, to a reasonable probability, that gene amplification would lead to increased gene 
expression. 
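
The proportions quoted above can be set side by side in a short sketch. It uses only the figures stated in the cited passages (18 of 23 cases in Orntoft; 62% in Pollack) and compares each with one half. Treating "more likely than not" as a proportion above 50% is an illustrative simplification assumed here, not a statement of the legal standard, and the names in the code are hypothetical.

```python
from fractions import Fraction

# Proportions exactly as quoted from the cited references; no other data is assumed.
reported = {
    "Orntoft: areas with >2-fold DNA gain showing increased mRNA": Fraction(18, 23),
    "Pollack: highly amplified genes with elevated expression": Fraction(62, 100),
}


def exceeds_half(proportion: Fraction) -> bool:
    """Return True when the proportion is strictly greater than one half."""
    return proportion > Fraction(1, 2)


if __name__ == "__main__":
    for label, proportion in reported.items():
        print(f"{label}: {float(proportion):.1%} (above 50%: {exceeds_half(proportion)})")
```

Both quoted proportions lie above one half, which is consistent with the position that amplification is more often than not accompanied by increased expression.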

Relying on a single contrary example of one gene, the PTO states that the literature 
reports that it does not necessarily follow that an increase in gene copy number results in 
increased gene expression and increased polypeptide expression. The PTO focuses on a 
statement from the abstract of Pennica that the WISP-2 gene DNA was amplified in colon 
tumors, but RNA expression was reduced. Pennica at 14717. This inverse correlation is in 
contrast to the WISP-1 gene, which was amplified and had higher RNA levels. The authors of 
Pennica offer an explanation for what they obviously viewed as an anomalous result: "Because 
the center of the 20q13 amplicon [of which WISP-2 is a part] has not yet been identified, it is
possible that the apparent amplification observed for WISP-2 may be caused by another gene in 
this amplicon." Id. at 14722, emphasis added. Thus, the example of a lack of positive
correlation between gene amplification and RNA levels relied on by the PTO may not even be
real. The fact that the authors attempt to explain this anomaly only supports Applicants'
argument that the accepted understanding in the art is that there is a direct correlation between
gene amplification and an increase in gene expression.

Appl. No.: 10/032,996
Filed: December 27, 2001

As stated above, the standard for utility is not absolute certainty, but rather whether one 
of skill in the art would be more likely than not to believe the asserted utility. Even if Pennica 
supported the PTO's argument, which it does not, one contrary example is not sufficient to show
that a person of skill in the art would reasonably doubt that gene amplification is
correlated to gene expression. Given the evidence provided by the Applicants which establishes
that there is a reasonable correlation between gene amplification and mRNA expression, one of 
skill in the art would believe, to a reasonable probability, that the reported amplification of the 
PR0539 gene would lead to an increase in the level of PR0539 mRNA. 

Applicants next address the PTO's argument that there is no necessary correlation 
between mRNA levels and protein levels. 

Applicants have previously submitted a copy of a Declaration by J. Christopher Grimaldi, 
an expert in the field of cancer biology. As stated in paragraph 5 of the declaration, "Those who 
work in this field are well aware that in the vast majority of cases, when a gene is over- 
expressed... the gene product or polypeptide will also be overexpressed." Similarly, the 
previously submitted declaration of Paul Polakis, Ph.D., an expert in the field of cancer biology 
states that "it remains a central dogma in molecular biology that increased mRNA levels are 
predictive of corresponding increased levels of the encoded protein." Polakis Declaration,
paragraph 6. He cites as supporting evidence not only his years of personal experience, but also 
results from experiments related to the present application. He reports that for the mRNAs 
overexpressed in cancer that have been examined, 80% had correspondingly higher levels of the 
encoded protein. Polakis Declaration at paragraphs 4 and 5. 

The statements of Grimaldi and Polakis are supported by the teachings in Molecular 
Biology of the Cell, a leading textbook in the field (Bruce Alberts, et al., Molecular Biology of
the Cell (4th ed. 2002) submitted herewith as Exhibit 4). Figure 6-3 on page 302 illustrates the
basic principle that there is a correlation between increased gene expression and increased 
protein expression. The accompanying text states that "a cell can change (or regulate) the 
expression of each of its genes according to the needs of the moment - most obviously by 
controlling the production of its mRNA." Molecular Biology of the Cell at 302, emphasis added.
Similarly, Figure 6-90 on page 364 illustrates the path from gene to protein. The accompanying
text states that while potentially each step can be regulated by the cell, "the initiation of 
transcription is the most common point for a cell to regulate the expression of each of its genes." 
Molecular Biology of the Cell at 364. This point is repeated on page 379, where the authors state 
that of all the possible points for regulating protein expression, "[f]or most genes transcriptional 
controls are paramount." Molecular Biology of the Cell at 379. 

Together, the declarations of Grimaldi and Polakis, the accompanying references, and the 
excerpts from the Molecular Biology of the Cell all establish that the accepted understanding in 
the art is that there is a reasonable correlation between gene expression and the level of the 
encoded protein. Applicants have demonstrated the increased expression of the gene encoding 
PR0539, and have provided sufficient evidence to show that there is a reasonable correlation 
between expression of the gene and the level of PR0539 protein. 

In arguing against this assertion, the PTO cites two references. The PTO relies on a 
statement from Gokman-Polar that "PKC mRNA levels do not directly correlate with PKC 
protein levels." Office Action at 3-4. However, a close review of the entire article indicates that 
with one exception, the trend in the data is that mRNA and protein levels are positively 
correlated, supporting Applicants' assertion. In Figure 2, the protein level of two isozymes shows
a decrease, while the third is increased. This same pattern is seen for the corresponding mRNA 
levels in Figure 6, although admittedly the increase in mRNA for the third isozyme is minimal. 
Similarly, comparing the protein levels of the three isozymes in Figure 4 to the corresponding 
mRNA levels in Figure 7, with one exception the mRNA levels are positively correlated to 
protein levels. While protein levels do not increase or decrease in direct proportion to the 
changes in mRNA, the trend in five of the six examples is that protein levels are positively 
correlated to mRNA levels. This evidence is hardly sufficient to establish that one of skill in the 
art would reasonably doubt that there is a reasonable correlation between mRNA levels and 
protein levels. 

The Meric article cited by the PTO offers even less support for the PTO's position. The 
PTO relies on the statement that "Gene expression is quite complicated, however, and is also 
regulated at the level of mRNA stability, mRNA translation, and protein stability." Office 
Action at 3. What the PTO ignores is the preceding statement by the authors:

The fundamental principle of molecular therapeutics in cancer is to exploit the
differences in gene expression between cancer cells and normal cells... [M]ost 
efforts have concentrated on identifying differences in gene expression at the level 
of mRNA, which can be attributable to either DNA amplification or to differences 
in transcription. Meric at 971 (emphasis added). 

This statement supports Applicants' asserted utility. It is true that there is no necessary
correlation between gene expression and protein expression because there are other mechanisms
for regulating gene expression. However, were there no significant correlation between gene
expression and protein levels, exploiting differences in gene expression between cancer cells and 
normal cells would not be a "fundamental principle of molecular therapeutics in cancer." 

In light of the lack of significant support for the PTO's argument, Applicants submit that 
the PTO has failed to establish a reason for one of skill in the art to doubt the asserted utility. 
Even if it has, Applicants have offered sufficient evidence to rebut the PTO's argument and 
establish that there is a reasonable correlation between gene amplification, gene expression, and 
protein expression. The PTO is again reminded that absolute predictability is not required. 
Applicants have established that it is more likely than not that one of skill in the art would be 
convinced, to a reasonable probability, that the PR0539 protein is overexpressed in certain 
cancers, and therefore has utility as a diagnostic tool. 

The Instant Case Differs Significantly from Example 12 of the Utility Guidelines 

Applicants next address the PTO's argument that the current situation closely tracks 
Example 12 of the Utility Guidelines, because where there is no necessary relationship between 
the protein levels or utilities and a small level of mRNA overexpression in cancer cells, the 
invention lacks any "real world" context of use for PR0539. 

In Example 12, the specification discloses a protein, receptor A, which is the binding 
partner for protein X. The specification does not characterize the isolated protein with regard to 
its biological function or any disease or body condition that is associated with the isolated 
protein. In addition, the function of protein X has also not been identified. One of the asserted 
utilities for receptor A is making monoclonal antibodies to receptor A which can be used as a 
therapeutic drug to effect control over the receptor. In the analysis of this asserted utility for 
receptor A, the Utility Guidelines state that "since neither the specification nor the art of record 
disclose any diseases or conditions associated with receptor A, the asserted utility in this case 
essentially is a method of treating an unspecified, undisclosed disease or condition, which does
not define a 'real world' context of use." Utility Guidelines at 66, emphasis added. 

The situation in Example 12 is not the situation here. Applicants have demonstrated that 
the nucleic acid encoding PR0539 is amplified in certain cancers. Thus, unlike the protein in 
Example 12, PR0539 is associated with a disease or condition - more specifically, lung and 
colon cancer. 

The PTO asserts that because it is the nucleic acid, and not the PR0539 polypeptide 
which has been shown to be amplified in cancer cells, the PR0539 polypeptide is not associated 
with any disease. However, as discussed at length above, Applicants have demonstrated a 
reasonable correlation between gene amplification and protein expression such that one of skill 
in the art would believe, to a reasonable probability, that the PR0539 protein is overexpressed in 
certain cancers and is therefore useful as a diagnostic tool. 

The present situation closely resembles the caveat discussed at the end of Example 12, 
where receptor A is shown to be present on the cell membranes of melanoma cells but not on the 
cell membranes of normal skin cells. The Utility Guidelines state that in that situation, "making 
a monoclonal antibody to receptor A for diagnosing melanoma would constitute a well- 
established utility." Utility Guidelines at 70. Similarly, here Applicants have provided evidence 
that it is more likely than not that PR0539 is expressed at higher levels in certain cancer cells 
than normal tissue, and therefore it can be used to make diagnostic antibodies. 

The PTO's Response to Applicants' Expert Declarations and Arguments is Insufficient to Reject
the Applicants' Asserted Utility

The PTO's rejection of Applicants' expert declarations as "fundamentally flawed" 
because they fail to provide specific evidence regarding PR0539 is unwarranted. As discussed 
above, specific evidence of overexpression of PR0539 in cancer is not required. Instead, indirect 
evidence of the asserted utility of the PR0539 polypeptide can be offered so long as there is a 
"reasonable correlation" between the proffered evidence and asserted utility, such that it is more 
likely than not that a person of skill in the art would believe the asserted utility. The Applicants' 
expert declarations establish a reasonable correlation between gene amplification and protein 
expression, and in view of Applicants' data showing amplification of the gene encoding PR0539 
in lung and colon cancer, thus support the asserted utility of the PR0539 polypeptides as a
diagnostic tool for cancer. 

Similarly, the PTO's reliance on In re Kirk, 376 F.2d 936, 153 U.S.P.Q. 48 (C.C.P.A. 
1967) is also misplaced. In Kirk, the asserted utility for the claimed compounds was "a new class 
of compounds often possessing high biological activity" and "intermediates in the preparation of 
compounds with valuable biological properties. . .". Id. at 1120, 1121. The Court rejected these
statements as "nebulous expressions" of usefulness. Id. at 1124.

Here, Applicants have asserted a much more specific utility than "biological activity" or 
"biological properties". Applicants have provided evidence of amplification of the gene 
encoding PR0539 in certain cancers and have shown that this evidence is reasonably correlated 
to overexpression of the PR0539 polypeptide in those same cancers, namely, lung and colon 
cancer. As Example 12 of the Utility Guidelines makes clear, when a protein is differentially
expressed in cancer compared to normal tissue, the protein has utility in making antibodies which 
can be used to diagnose the cancer. This is the situation here. 

Specific Utility 

The PTO argues that even if substantial utility were found, there is no specific utility 
given for the PR0539 protein, since the protein, as distinguished from the nucleic acid, has not 
been associated with any disease, condition, or any other specific feature. Relying on the lack of 
correlation between levels of nucleic acid and protein cited in the Gokman-Polar and Meric 
references, the PTO argues that the overexpression of the nucleic acid gives no specific utility 
because it is entirely unrelated to uses of the protein. Applicants respectfully disagree. 

Specific Utility is defined as utility which is "specific to the subject matter claimed," in 
contrast to "a general utility that would be applicable to the broad class of the invention." 
M.P.E.P. § 2107.01, part I (2004). Applicants submit that the evidence of overexpression of 
PR0539 nucleic acids in certain types of cancer cells along with the declarations and references 
discussed above provide a specific utility for the claimed proteins. As stated above, Applicants 
have established a reasonable correlation between gene amplification and protein expression. 
This makes the PR0539 protein useful in diagnosing lung and colon cancer. This is not a 
general utility that would apply to the broad class of proteins.

The amplification of PR0539 nucleic acid in certain cancer cells distinguishes this case
from Examples 4 and 12 of the Utility Guidelines cited by the PTO. In both examples, there is 
no description of the protein beyond its sequence or its binding of an unidentified ligand. Here, 
the disclosed proteins are encoded by a nucleic acid that is amplified in certain cancer cells, 
which is reasonably correlated to overexpression of the PR0539 polypeptide. This makes the 
utility of using the protein to diagnose lung and colon cancer specific, since in general, proteins 
are not overexpressed in cancer cells. 

The PTO's previous response to Applicants' arguments regarding specific utility is 
lacking. The PTO asserts that because Applicants' arguments presume the PR0539 protein is 
overexpressed, and this is not necessarily the case, this cannot serve as the foundation to support 
specific utility. However, utility need not be established "beyond a reasonable doubt" or to a 
"statistical certainty." Rather, Applicants need only establish that the asserted utility is "more 
likely than not." M.P.E.P. at § 2107.02, part VII (2004). Thus, it need not be shown that 
overexpression of PR0539 polypeptide is necessarily the case, only that it is more likely than 
not, which Applicants have done. 

Conclusion 

Given the totality of the evidence provided, Applicants submit that they have established 
a credible, substantial, and specific utility for the claimed polypeptides as diagnostic tools. 
According to the M.P.E.P. and case-law cited above, irrefutable proof of a claimed utility is not 
required. Rather, a specific and substantial credible utility requires only a "reasonable" 
confirmation of a real world context of use. Applicants have offered sufficient evidence to 
establish that there is a reasonable correlation between gene amplification, gene expression, and 
protein expression. Applicants have established that it is more likely than not that one of skill in 
the art would be convinced, to a reasonable probability, that based on the gene amplification data 
for the gene encoding PR0539, the PR0539 protein is overexpressed in certain cancers, and 
therefore has utility as a diagnostic tool. In view of the above, Applicants respectfully request 
that the PTO reconsider and withdraw the utility rejection under 35 U.S.C. § 101. 

Rejection under 35 U.S.C. § 112 - Written Description 

The PTO has rejected Claims 35-45 under 35 U.S.C. § 112, first paragraph, as containing 
subject matter which was not described in the specification in such a way as to reasonably convey 
to one skilled in the art that the inventors, at the time the application was filed, had possession of 
the claimed invention. The PTO states that "the compound is claimed solely [by] its protein 
sequence related 80%-99% SEQ ID NO: 7 without any correlative function to delimit the 
structure." The Office Action at page 9. The PTO argues that at the time of filing there is no 
record or description which would demonstrate conception of any proteins other than those 
expressly disclosed which comprise SEQ ID NO: 7. 

The Legal Standard for Written Description 

The well-established test for sufficiency of support under the written description 
requirement of 35 U.S.C. § 112, first paragraph, is whether the disclosure "reasonably conveys to 
the artisan that the inventor had possession at that time of the later claimed subject matter." In re 
Kaslow, 707 F.2d 1366, 1375, 217 USPQ 1089, 1096 (Fed. Cir. 1983); see also Vas-Cath, Inc. 
v. Mahurkar, 935 F.2d 1555, 1563, 19 USPQ2d 1111, 1116 (Fed. Cir. 1991). The adequacy of written 
description support is a factual issue and is to be determined on a case-by-case basis. See e.g., 
Vas-Cath, Inc. v. Mahurkar, 935 F.2d at 1563, 19 USPQ2d at 1116 (Fed. Cir. 1991). The factual 
determination in a written description analysis depends on the nature of the invention and the 
amount of knowledge imparted to those skilled in the art by the disclosure. Union Oil v. Atlantic 
Richfield Co., 208 F.3d 989, 996 (Fed. Cir. 2000). 

The Current Invention is Adequately Described 

As noted above, whether the Applicants were in possession of the invention as of the 
effective filing date of an application is a factual determination, reached by the consideration of a 
number of factors, including the level of knowledge and skill in the art, and the teaching 
provided by the specification. The inventor is not required to describe every single detail of 
his/her invention. An Applicant's disclosure obligation varies according to the art to which the 
invention pertains. 

The present invention pertains to the field of recombinant DNA/protein technology. It is 
well established that the level of skill in this field is very high since a representative person of 
skill is generally a Ph.D. scientist with several years of experience. Accordingly, the teaching 
imparted in the specification must be evaluated through the eyes of a highly skilled artisan as of 
the date the invention was made. The subject matter of the rejected claims concerns polypeptides 
having a sequence identity of 95-99% with the specified polypeptide sequence of SEQ ID NO: 7, 
and as amended, with the functional recitation: "wherein said isolated polypeptide is 
overexpressed in lung or colon tumors compared to normal lung or colon tissue, respectively, or 
wherein said isolated polypeptide is encoded by a nucleic acid which is amplified in lung or 
colon tumor compared to normal lung or colon tissue, respectively." Based on the detailed 
description of the cloning and expression of variants of PR0539 in the specification, the 
description of the gene amplification assay, the actual reduction to practice of sequences SEQ ID 
NOs: 6 and 7, and the functional recitation in the instant claims, Applicants submit that one of 
skill in the art would know that Applicants possessed the subject matter of the pending claims. 
Hence, Applicants respectfully request that the PTO reconsider and withdraw the written 
description rejection under 35 U.S.C. §112. 

Rejection under 35 U.S.C. §112 - Enablement 

The PTO rejected Claims 27-45 under 35 U.S.C. § 112, first paragraph, as containing 
subject matter which was not described in the specification in such a way as to enable one skilled 
in the art to make and/or use the invention. The PTO cites In re Wands and the factors set forth 
therein to determine the scope of enablement. However, Applicants respectfully submit that the 
PTO's conclusions are inconsistent with the teachings of Wands, as they rest on the erroneous 
assumption that a necessary connection between gene amplification and protein expression is 
required. The PTO states that "[w]ith regard to enablement, fundamentally the same arguments 
[as those given for utility] apply, and this rejection is maintained for the same reasons as given 
above in response to the arguments on utility." Office Action at 18. 

The Applicants believe that the evidence, declarations, references, and arguments 
discussed above make clear that Applicants have established that it is more likely than not that 
one of skill in the art would be convinced, to a reasonable probability, that the PR0539 protein is 
overexpressed in certain cancers, and therefore has utility as a diagnostic tool. This would 
include the use of the claimed polypeptides to create diagnostic and therapeutic antibodies. This 
use is disclosed in the application, for example at page 88, lines 4-5 of the substitute 
specification, and the techniques for the creation of antibodies are well known and routine in the 
art. Thus, at least one use of PR0539 polypeptides is adequately enabled, which is all that is 
required - "if any use is enabled when multiple uses are disclosed, the application is enabling for 
the claimed invention." M.P.E.P. § 2164.01(c). In view of the above, Applicants respectfully 
request that the Examiner reconsider and withdraw the enablement rejection under 35 U.S.C. 
§ 112, first paragraph. 



CONCLUSION 

In view of the above, Applicants respectfully maintain that the claims are patentable and 
request that they be passed to issue. Applicants invite the Examiner to call the undersigned if any 
remaining issues may be resolved by telephone. 

Please charge any additional fees, including any fees for additional extension of time, or 
credit overpayment to Deposit Account No. 11-1410. 

Respectfully submitted, 

KNOBBE, MARTENS, OLSON & BEAR, LLP 



Dated 



leMarie Kaiser 
Registration No. 
Attorney of Record 
Customer No. 30,313 
(619) 235-8550 
Genome-wide Study of Gene Copy Numbers, 
Transcripts, and Protein Levels in Pairs of 
Non-invasive and Invasive Human Transitional 
Cell Carcinomas* 

Torben F. Ørntoft, Thomas Thykjaer, Frederic M. Waldman, Hans Wolf, 
and Julio E. Celis 



Gain and loss of chromosomal material is characteristic 
of bladder cancer, as well as malignant transformation in 
general. The consequences of these changes at both the 
transcription and translation levels is at present unknown, 
partly because of technical limitations. Here we have 
attempted to address this question in pairs of non-invasive 
and invasive human bladder tumors using a combination 
of technology that included comparative genomic 
hybridization, high density oligonucleotide array-based 
monitoring of transcript levels (5600 genes), and high resolution 
two-dimensional gel electrophoresis. The results showed 
that there is a gene dosage effect that in some cases 
superimposes on other regulatory mechanisms. This 
effect depended (p < 0.015) on the magnitude of the 
comparative genomic hybridization change. In general (16 of 
23 cases), chromosomal areas with more than 2-fold gain 
of DNA showed a corresponding increase in mRNA 
transcripts. Areas with loss of DNA, on the other hand, 
showed either reduced or unaltered transcript levels. 
Because most proteins resolved by two-dimensional gels 
are unknown it was only possible to compare mRNA and 
protein alterations in relatively few cases of well focused 
abundant proteins. With few exceptions we found a good 
correlation (p < 0.005) between transcript alterations and 
protein levels. The implications, as well as limitations, 
of the approach are discussed. Molecular & Cellular 
Proteomics 1:37-45, 2002. 

From the Department of Clinical Biochemistry, Molecular 
Diagnostic Laboratory and Department of Urology, Aarhus University 
Hospital, Skejby, DK-8200 Aarhus N, Denmark, AROS Applied 
Biotechnology ApS, Gustav Wieds Vej 10, DK-8000 Aarhus C, Denmark, 
UCSF Cancer Center and Department of Laboratory Medicine, 
University of California, San Francisco, CA 94143-0808, and Institute 
of Medical Biochemistry and Danish Centre for Human Genome 
Research, Ole Worms Allé 170, Aarhus University, DK-8000 Aarhus C, 
Denmark 

Received, September 26, 2001, and in revised form, November 7, 
2001 

Published, MCP Papers in Press, November 13, 2001, DOI 
10.1074/mcp.M100019-MCP200 

Aneuploidy is a common feature of most human cancers 
(1), but little is known about the genome-wide effect of this 
phenomenon at both the transcription and translation levels. 
High throughput array studies of the breast cancer cell line 
BT474 has suggested that there is a correlation between 
DNA copy numbers and gene expression in highly amplified 
areas (2), and studies of individual genes in solid tumors 
have revealed a good correlation between gene dose and 
mRNA or protein levels in the case of c-erb-B2, cyclin D1, 
ems1, and N-myc (3-5). However, a high cyclin D1 protein 
expression has been observed without simultaneous 
amplification (4), and a low level of c-myc copy number 
increase was observed without concomitant c-myc protein 
overexpression (6). 

In human bladder tumors, karyotyping, fluorescent in situ 
hybridization, and comparative genomic hybridization (CGH) 
have revealed chromosomal aberrations that seem to be 
characteristic of certain stages of disease progression. In the 
case of noninvasive pTa transitional cell carcinomas (TCCs), 
this includes loss of chromosome 9 or parts of it, as well as 
loss of Y in males. In minimally invasive pT1 TCCs, the 
following alterations have been reported: 2q-, 11p-, 1q+, 
11q13+, 17q+, and 20q+ (7-12). It has been suggested that 
these regions harbor tumor suppressor genes and 
oncogenes; however, the large chromosomal areas involved often 
contain many genes, making meaningful predictions of the 
functional consequences of losses and gains very difficult. 

In this investigation we have combined genome-wide 
technology for detecting genomic gains and losses (CGH) with 
gene expression profiling techniques (microarrays and 
proteomics) to determine the effect of gene copy number on 
transcript and protein levels in pairs of noninvasive and 
invasive human bladder TCCs. 

EXPERIMENTAL PROCEDURES 

Material - Bladder tumor biopsies were sampled after informed 
consent was obtained and after removal of tissue for routine 
pathology examination. By light microscopy tumors 335 and 532 were 
staged by an experienced pathologist as pTa (superficial papillary), 
grade I and II, respectively; tumors 733 and 827 were staged as pT1 
(invasive into submucosa), 733 was staged as solid, and 827 was 
staged as papillary, both grade III. 

The abbreviations used are: CGH, comparative genomic 
hybridization; TCC, transitional cell carcinoma; LOH, loss of 
heterozygosity; PA-FABP, psoriasis-associated fatty acid-binding 
protein; 2D, two-dimensional. 

mRNA Preparation - Tissue biopsies, obtained fresh from surgery, 
were embedded immediately in a sodium-guanidinium thiocyanate 
[illegible]. Total RNA was isolated using the [illegible] 
isolation method (WAK-Chemie Medical GMBH). [illegible] 
isolated [illegible] selection step (Oligotex mRNA kit; Qiagen). 

cRNA Preparation - [quantity illegible] μg of mRNA was used as 
starting material. The first and second strand cDNA synthesis was 
performed using the SuperScript® Choice System (Invitrogen) 
according to the manufacturer's instructions but using an oligo(dT) 
primer containing a T7 RNA polymerase binding site. Labeled cRNA 
was prepared using the MEGAscript® in vitro transcription kit 
(Ambion). Biotin-labeled CTP and UTP (Enzo) were used, together 
with unlabeled NTPs in the reaction. Following the in vitro 
transcription reaction, the unincorporated nucleotides were 
removed using RNeasy columns (Qiagen). 

Array Hybridization and Scanning - Array hybridization and 
scanning was modified from a previous method (13). 10 μg of cRNA was 
fragmented at 94 °C for 35 min in buffer containing 40 mM Tris 
acetate, pH 8.1, 100 mM KOAc, 30 mM MgOAc. Prior to hybridization 
the fragmented cRNA in a 6x SSPE-T hybridization buffer (1 M NaCl, 
10 mM Tris, pH 7.6, 0.005% Triton) was heated to 95 °C for 5 min, 
subsequently cooled to 40 °C, and loaded onto the Affymetrix probe 
array cartridge. The probe array was then incubated for 16 h at 40 °C 
at constant rotation (60 rpm). The probe array was exposed to 10 
washes in 6x SSPE-T at 25 °C followed by 4 washes in 0.5x SSPE-T 
at 50 °C. The biotinylated cRNA was stained with a 
streptavidin-phycoerythrin conjugate, 10 μg/ml (Molecular Probes), 
in 6x SSPE-T for 30 min at 25 °C followed by 10 washes in 6x SSPE-T 
at 25 °C. The probe arrays were scanned at 560 nm using a confocal 
laser scanning microscope (made for Affymetrix by Hewlett-Packard). 
The readings from the quantitative scanning were analyzed by 
Affymetrix gene expression analysis software. 

Microsatellite Analysis - Microsatellite analysis was performed as 
described previously (14). Microsatellites were selected by use of 
www.ncbi.nlm.nih.gov/genemap98, and primer sequences were 
obtained from the genome data base at www.gdb.org. DNA was extracted 
from tumor and blood and amplified by PCR in a volume of 20 μl for 35 
cycles. The amplicons were denatured and electrophoresed on an 
ABI Prism 377. Data were collected in the GeneScan program for 
fragment analysis. Loss of heterozygosity was defined as 
[illegible] of one allele detected in tumor amplicons compared 
with blood. 

Proteomic Analysis - TCCs were minced into small pieces and 
homogenized in a small glass homogenizer in 0.5 ml of lysis solution. 
Samples were stored at -20 °C until use. The procedure for 2D gel 
electrophoresis has been described in detail elsewhere (15, 16). Gels 
were stained with silver nitrate and/or Coomassie Brilliant Blue. 
Proteins were identified by a combination of procedures that included 
microsequencing, mass spectrometry, two-dimensional gel Western 
immunoblotting, and comparison with the master two-dimensional gel 
image of human keratinocyte proteins; see biobase.dk/cgi-bin/celis. 

CGH - Hybridization of differentially labeled tumor and normal DNA 
to normal metaphase chromosomes was performed as described 
previously (10). Fluorescein-labeled tumor DNA (200 ng), Texas 
Red-labeled reference DNA (200 ng), and human Cot-1 DNA (20 μg) were 
denatured at 37 °C for 5 min and applied to denatured normal 
metaphase slides. Hybridization was at 37 °C for 2 days. After washing, 
the slides were counterstained with 0.15 μg/ml 
4,6-diamidino-2-phenylindole in an anti-fade solution. A second 
hybridization was performed for all tumor samples using 
fluorescein-labeled reference DNA and Texas Red-labeled tumor DNA 
(inverse labeling) to confirm the aberrations detected during the 
initial hybridization. Each CGH experiment also included a normal 
control hybridization using fluorescein- and Texas Red-labeled 
normal DNA. Digital image analysis was used to identify chromosomal 
regions with abnormal fluorescence ratios, indicating regions of DNA 
gains and losses. The average green:red fluorescence intensity ratio 
profiles were calculated using four images of each chromosome (eight 
chromosomes total) with normalization of the green:red fluorescence 
intensity ratio for the entire metaphase and background correction. 
Chromosome identification was performed based on 
4,6-diamidino-2-phenylindole banding patterns. Only images showing 
uniform high intensity fluorescence with minimal background staining 
were analyzed. All centromeres, p arms of acrocentric chromosomes, 
and heterochromatic regions were excluded from the analysis. 

RESULTS 

Comparative Genomic Hybridization - The CGH analysis 
identified a number of chromosomal gains and losses in the 
two invasive tumors (stage pT1, TCCs 733 and 827), whereas 
the two non-invasive papillomas (stage pTa, TCCs 335 and 
532) showed only 9p-, 9q22-q33-, and [illegible], and 7+, 9q-, 
and Y-, respectively. Both invasive tumors showed changes 
(1q22-24+, 2q14.1-qter-, 3q12-q13.3-, 6q12-q22-, 
9q34+, 11q12-q13+, 17+, and 20q11.2-q12+) that are 
typical for their disease stage, as well as additional alterations, 
some of which are shown in Fig. 1. Areas with gains and 
losses deviated from the normal copy number to some extent, 
and the average numerical deviation from normal was 0.4-fold 
in the case of TCC 733 and 0.3-fold for TCC 827. The largest 
changes, amounting to at least a doubling of chromosomal 
content, were observed at 1q23 in TCC 733 (Fig. 1A) and 
20q12 in TCC 827 (Fig. 1B). 

Table I 

Correlation between alterations detected by CGH and by expression monitoring 

Top, CGH used as independent variable (if CGH alteration - what expression 
ratio was found); bottom, altered expression used as independent variable 
(if expression alteration - what CGH deviation was found). 

Top: CGH alterations as independent variable 

Tumor pair    CGH alterations   Expression change clusters                          Concordance 
733 vs. 335   13 Gain           10 Up-regulation, 0 Down-regulation, 3 No change    77% 
733 vs. 335   10 Loss           1 Up-regulation, 5 Down-regulation, 4 No change     50% 
827 vs. 532   10 Gain           8 Up-regulation, 0 Down-regulation, 2 No change     80% 
827 vs. 532   12 Loss           3 Up-regulation, 2 Down-regulation, 7 No change     17% 

Bottom: expression change clusters as independent variable 

Tumor pair    Expression change clusters   CGH alterations                  Concordance 
733 vs. 335   16 Up-regulation             11 Gain, 2 Loss, 3 No change     69% 
733 vs. 335   21 Down-regulation           1 Gain, 8 Loss, 12 No change     38% 
733 vs. 335   15 No change                 3 Gain, 3 Loss, 9 No change      60% 
827 vs. 532   17 Up-regulation             10 Gain, 5 Loss, 2 No change     59% 
827 vs. 532   9 Down-regulation            0 Gain, 3 Loss, 6 No change      33% 
827 vs. 532   21 No change                 1 Gain, 3 Loss, 17 No change     81% 

mRNA Expression in Relation to DNA Copy Number - The 
mRNA levels from the two invasive tumors (TCCs 827 and 
733) were compared with the two non-invasive counterparts 
(TCCs 532 and 335). This was done in two separate 
experiments in which we compared TCCs 733 to 335, and 827 to 
532, respectively, using two different scaling settings for the 
arrays to rule out scaling as a confounding parameter. 
Approximately 1,800 genes that yielded a signal on the arrays 
were searched in the Unigene and Genemap data bases for 
chromosomal location, and those with a known location 
(1096) were plotted as bars covering their purported locus. In 
that way it was possible to construct a graphic presentation of 
DNA copy number and relative mRNA levels along the 
individual chromosomes (Fig. 1). 

For each mRNA a ratio was calculated between the level in 
the invasive versus the non-invasive counterpart. Bars, which 
represent chromosomal location of a gene, were color-coded 
according to the expression ratio, and only differences larger 
than 2-fold were regarded as informative (Fig. 1). The density 
of genes along the chromosomes varied, and areas 
containing only one gene were excluded from the calculations. The 
resolution of the CGH method is very low, and some of the 
outlier data may be because of the fact that the boundaries of 
the chromosomal aberrations are not known at high resolution. 
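
The 2-fold cut-off described above can be illustrated with a short Python sketch. It is an illustration only, not the authors' analysis software: the function name classify_expression and the fold_cutoff parameter are hypothetical, and transcripts scored as absent are not handled. The example values are the array units quoted in the legend to Fig. 5.

```python
def classify_expression(invasive_level, noninvasive_level, fold_cutoff=2.0):
    """Classify one transcript as 'up', 'down', or 'no change'.

    The ratio is invasive / non-invasive. Only differences larger than
    fold_cutoff in either direction are treated as informative.
    (Hypothetical helper; the cut-off of 2 comes from the text.)
    """
    if invasive_level <= 0 or noninvasive_level <= 0:
        raise ValueError("expression levels must be positive")
    ratio = invasive_level / noninvasive_level
    if ratio > fold_cutoff:
        return "up"
    if ratio < 1.0 / fold_cutoff:
        return "down"
    return "no change"


# Array units quoted in the Fig. 5 legend, as (invasive, non-invasive).
examples = {
    "cytokeratin 13 (TCC 827 vs. 532)": (623, 15659),
    "PA-FABP (TCC 733 vs. 335)": (166, 1277),
}
for name, (invasive, noninvasive) in examples.items():
    print(name, classify_expression(invasive, noninvasive))
```

Both example transcripts fall well below one half of the reference level, so the sketch reports them as down-regulated, in line with the figure legend.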

Two sets of calculations were made from the data. For the 
first set we used CGH alterations as the independent variable 
and estimated the frequency of expression alterations in these 
chromosomal areas. In general, areas with a strong gain of 
chromosomal material contained a cluster of genes having 
increased mRNA expression. For example, both 
chromosomes 1q21-q25, 2p, and 9q showed a relative gain of more 
than 100% in DNA copy number that was accompanied by 
increased mRNA expression levels in the two tumor pairs (Fig. 
1). In most cases, chromosomal gains detected by CGH were 
accompanied by an increased level of transcripts in both 
TCCs 733 (77%) and 827 (80%) (Table I, top). Chromosomal 
losses, on the other hand, were not accompanied by 
decreased expression in several cases, and were often 
registered as having unaltered RNA levels (Table I, top). The 
inability to detect RNA expression changes in these cases was not 
because of fewer genes mapping to the lost regions (data not 
shown). 
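
The concordance percentages cited from Table I, top, follow directly from the cluster counts in that table. The Python sketch below reproduces the arithmetic under one assumption, namely that concordance is the number of clusters changing in the same direction as the CGH alteration divided by all clusters in altered regions, rounded to a whole percent. The function name concordance_percent is hypothetical.

```python
def concordance_percent(concordant, total):
    """Percentage of clusters whose expression changed in the same
    direction as the CGH alteration (hypothetical helper)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * concordant / total)


# (tumor pair, CGH alteration): (concordant clusters, all clusters),
# with counts taken from Table I, top.
table_i_top = {
    ("733 vs. 335", "gain"): (10, 13),
    ("733 vs. 335", "loss"): (5, 10),
    ("827 vs. 532", "gain"): (8, 10),
    ("827 vs. 532", "loss"): (2, 12),
}
results = {key: concordance_percent(*counts) for key, counts in table_i_top.items()}
for (pair, alteration), percent in results.items():
    print(f"{pair} {alteration}: {percent}%")
```

Under that assumption the gains give 77% and 80% and the losses give 50% and 17%, matching the values reported in the table.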

In the second set of calculations we selected expression 
alterations above 2-fold as the independent variable and 
estimated the frequency of CGH alterations in these areas. As 
above, we found that increased transcript expression 
correlated with gain of chromosomal material (TCC 733, 69% and 
TCC 827, 59%), whereas reduced expression was often 
detected in areas with unaltered CGH ratios (Table I, bottom). 
Furthermore, as a control we looked at areas with no 
alteration in expression. No alteration was detected by CGH in 
most of these areas (TCC 733, 60% and TCC 827, 81%; see 
Table I, bottom). Because the ability to observe reduced or 
increased mRNA expression clustering to a certain 
chromosomal area clearly reflected the extent of copy number 
changes, we plotted the maximum CGH aberrations in the 
regions showing CGH changes against the ability to detect a 
change in mRNA expression as monitored by the 
oligonucleotide arrays (Fig. 2). For both tumors, TCC 733 (p < 0.015) and 
TCC 827 (p < 0.00003), a highly significant correlation was 
observed between the level of CGH ratio change (reflecting 
the DNA copy number) and alterations detected by the array 
based technology (Fig. 2). Similar data were obtained when 
areas with altered expression were used as independent 
variables. These areas correlated best with CGH when the CGH 
ratio deviated 1.6- to 2.0-fold (Table I, bottom) but mostly did 
not at lower CGH deviations. These data probably reflect that 
loss of an allele may only lead to a 50% reduction in 
expression level, which is at the cut-off point for detection of 
expression alterations. Gain of chromosomal material can occur to a 
much larger extent. 

Microsatellite-based Detection of Minor Areas of Losses - In 
TCC 733, several chromosomal areas exhibiting DNA 
amplification were preceded or followed by areas with a 
normal CGH but reduced mRNA expression (see Fig. 1, TCC 733 
chromosome 1q32, 2p21, and 7q21 and q32, 9q34, and 
10q22). To determine whether these results were because of 
undetected loss of chromosomal material in these regions or 
because of other non-structural mechanisms regulating 
transcription, we examined two microsatellites positioned at 
chromosome 1q25-32 and two at chromosome 2p22. Loss of 
heterozygosity (LOH) was found at both 1q25 and at 2p22, 
indicating that minor deleted areas were not detected with the 
resolution of CGH (Fig. 3). Additionally, chromosome 2p in 
TCC 733 showed a CGH pattern of gain/no change/gain of 
DNA that correlated with transcript 
increase/decrease/increase. Thus, for the areas showing increased expression 
there was a correlation with the DNA copy number alterations 
(Fig. 1A). As indicated above, the mRNA decrease observed in 
the middle of the chromosomal gain was because of LOH, 
implying that one of the mechanisms for mRNA 
down-regulation may be regions that have undergone smaller losses of 
chromosomal material. However, this cannot be detected with 
the resolution of the CGH method. 

In both TCC 733 and TCC 827, the telomeric end of 
chromosome 11p showed a normal ratio in the CGH analysis; 
however, clusters of five and three genes, respectively, lost 
their expression. Two microsatellites (D11S1760, D11S922) 
positioned close to MUC2, IGF2, and cathepsin D indicated 
LOH as the most likely mechanism behind the loss of 
expression (data not shown). 

A reduced expression of mRNA observed in TCC 733 at 
chromosomes 3q24, 11p11, 12p12.2, 12q21.1, and 16q24 
and in TCC 827 at chromosome 11p15.5, 12p11, 15q11.2, 
and 18q12 was also examined for chromosomal losses using 
microsatellites positioned as close as possible to the gene loci 
showing reduced mRNA transcripts. Only the microsatellite 
positioned at 18q12 showed LOH (Fig. 3), suggesting that 
transcriptional down-regulation of genes in the other regions 
may be controlled by other mechanisms. 

Fig. 3. Microsatellite analysis of loss of heterozygosity. Tumor 
733 showing loss of heterozygosity at chromosome 1q25, detected 
(a) by D1S215 close to Hu class I histocompatibility antigen (gene 
number 38 in Fig. 1), (b) by D1S2735 close to cathepsin E (gene 
number 41 in Fig. 1), and (c) at chromosome 2p23 by D2S2251 close 
to general β-spectrin (gene number 11 on Fig. 1) and of (d) tumor 827 
showing loss of heterozygosity at chromosome 18q12 by D18S1118 
close to mitochondrial 3-oxoacyl-coenzyme A thiolase (gene number 
12 in Fig. 1). The upper curves show the electropherogram obtained 
from normal DNA from leukocytes (N), and the lower curves show the 
electropherogram from tumor DNA (T). In all cases one allele is 
partially lost in the tumor amplicon. 

Relation between Changes in mRNA and Protein Levels - 
2D-PAGE analysis, in combination with Coomassie Brilliant 
Blue and/or silver staining, was carried out on all four tumors 
using fresh biopsy material. 40 well resolved abundant known 
proteins migrating in areas away from the edges of the pH 
gradient, and having a known chromosomal location, were 
selected for analysis in the TCC pair 827/532. Proteins were 
identified by a combination of methods (see "Experimental 
Procedures"). In general there was a highly significant 
correlation (p < 0.005) between mRNA and protein alterations (Fig. 
4). Only one gene showed disagreement between transcript 
alteration and protein alteration. Except for a group of 
cytokeratins encoded by genes on chromosome 17 (Fig. 5) the 
analyzed proteins did not belong to a particular family. 26 well 
focused proteins whose genes had a known chromosomal 
location were detected in TCCs 733 and 335, and of these 19 
correlated (p < 0.005) with the mRNA changes detected using 
the arrays (Fig. 4). For example, PA-FABP was highly 
expressed in the non-invasive TCC 335 but lost in the invasive 
counterpart (TCC 733; see Fig. 5). The smaller number of 
proteins detected in both 733 and 335 was because of the 
smaller size of the biopsies that were available. 

Fig. 4. Correlation between protein levels as judged by 
2D-PAGE and transcript ratio. For comparison proteins were divided in 
three groups, unaltered in level or up- or down-regulated (horizontal 
axis). The mRNA ratio as determined by oligonucleotide arrays was 
plotted for each gene (vertical axis). Two different symbol styles 
distinguish mRNAs that were scored as present in both tumors used 
for the ratio calculation from mRNAs that were scored as absent in 
the invasive tumors (along horizontal axis) or as absent in the 
non-invasive reference (top of figure). Two different scalings were 
used to exclude scaling as a confounder; TCCs 827 and 532 
(triangles) were scaled with background suppression, and TCCs 
733 and 335 (circles) were scaled without suppression. Both 
comparisons showed highly significant (p < 0.005) differences in mRNA 
ratios between the groups. Proteins shown were as follows: Group A 
(from left), phosphoglucomutase 1, glutathione transferase class μ 
number 4, fatty acid-binding protein homologue, cytokeratin 15, and 
cytokeratin 13; B (from left), fatty acid-binding protein homologue, 
28-kDa heat shock protein, cytokeratin 13, and calcyclin; C (from 
left), α-enolase, hnRNP B1, 28-kDa heat shock protein, 14-3-3-ε, and 
pre-mRNA splicing factor; D, mesothelial keratin K7 (type II); E (from 
top), glutathione S-transferase-π and mesothelial keratin K7 (type II); 
F (from top and left), adenylyl cyclase-associated protein, E-cadherin, 
keratin 19, calgizzarin, phosphoglycerate mutase, annexin IV, 
cytoskeletal γ-actin, hnRNP A1, integral membrane protein calnexin 
(IP90), hnRNP H, brain-type clathrin light chain-a, hnRNP F, 70-kDa 
heat shock protein, heterogeneous nuclear ribonucleoprotein A/B, 
translationally controlled tumor protein, liver 
glyceraldehyde-3-phosphate dehydrogenase, keratin 8, aldehyde reductase, and 
Na,K-ATPase β-1 subunit; G (from top and left), TCP20, calgizzarin, 
70-kDa heat shock protein, calnexin, hnRNP H, cytokeratin 15, ATP 
synthase, keratin 19, triosephosphate isomerase, hnRNP F, liver 
glyceraldehyde-3-phosphate dehydrogenase, glutathione 
S-transferase-π, and keratin 8; H (from left), plasma gelsolin, autoantigen 
calreticulin, thioredoxin, and NAD+-dependent 15-hydroxyprostaglandin 
dehydrogenase; I (from top), prolyl 4-hydroxylase β-subunit, 
cytokeratin 20, cytokeratin 17, prohibitin, and fructose 
1,6-bisphosphatase; J, annexin II; K, annexin IV; L (from top and left), 
90-kDa heat shock protein, prolyl 4-hydroxylase β-subunit, α-enolase, 
GRP 78, cyclophilin, and cofilin. 

Fig. 5. Comparison of protein and transcript levels in invasive 
and non-invasive TCCs. The upper part of the figure shows a 2D gel 
(left) and the oligonucleotide array (right) of TCC 532. The red 
rectangles on the upper gel highlight the areas that are compared below. 
Identical areas of 2D gels of TCCs 532 and 827 are shown below. 
Clearly, cytokeratins 13 and 15 are strongly down-regulated in TCC 
827 (red annotation). The tile on the array containing probes for 
cytokeratin 15 is enlarged below the array (red arrow) from TCC 532 
and is compared with TCC 827. The upper row of squares in each tile 
corresponds to perfect match probes; the lower row corresponds to 
mismatch probes containing a mutation (used for correction for 
unspecific binding). Absence of signal is depicted as black, and the 
higher the signal the lighter the color. A high transcript level was 
detected in TCC 532 (6151 units) whereas a much lower level was 
detected in TCC 827 (absence of signals). For cytokeratin 13, a high 
transcript level was also present in TCC 532 (15659 units), and a 
much lower level was present in TCC 827 (623 units). The 2D gels at 
the bottom of the figure (left) show levels of PA-FABP and 
adipocyte-FABP in TCCs 335 and 733 (invasive), respectively. Both proteins are 
down-regulated in the invasive tumor. To the right we show the array 
tiles for the PA-FABP transcript. A medium transcript level was 
detected in the case of TCC 335 (1277 units) whereas very low levels 
were detected in TCC 733 (166 units). IEF, isoelectric focusing. 

11 chromosomal regions where CGH showed aberrations 
that corresponded to the changes in transcript levels also 
showed corresponding changes in the protein level (Table II). 
These regions included genes that encode proteins that are 
found to be frequently altered in bladder cancer, namely 
cytokeratins 17 and 20, annexins II and IV, and the fatty 
acid-binding proteins PA-FABP and FBP1. Four of these 
proteins were encoded by genes in chromosome 17q, a 
frequently amplified chromosomal area in invasive bladder 
cancers. 

DISCUSSION 

Most human cancers have abnormal DNA content, having
lost some chromosomal parts and gained others. The present
study provides some evidence as to the effect of these gains
and losses on gene expression in two pairs of non-invasive
and invasive TCCs using high throughput expression arrays
and proteomics, in combination with CGH. In general, the
results showed that there is a clear individual regulation of the
mRNA expression of single genes, which in some cases was
superimposed by a DNA copy number effect. In most cases,
genes located in chromosomal areas with gains often
exhibited increased mRNA expression, whereas areas showing
losses showed either no change or a reduced mRNA
expression. The latter might be because of the fact that losses most
often are restricted to loss of one allele, and the cut-off point
for detection of expression alterations was a 2-fold change,
thus being at the border of detection. In several cases,
however, an increase or decrease in DNA copy number was
associated with de novo occurrence or complete loss of
transcript, respectively. Some of these transcripts could not be
detected in the non-invasive tumor but were present at
relatively high levels in areas with DNA amplifications in the
invasive tumors (e.g. in TCC 733 transcript from cellular ligand of
annexin II gene (chromosome 1q21) from absent to 2670
arbitrary units; in TCC 827 transcript from small proline-rich
protein 1 gene (chromosome 1q12-q21.1) from absent to
1326 arbitrary units). It may be anticipated from these data
that significant clustering of genes with an increased
expression to a certain chromosomal area indicates an increased
likelihood of gain of chromosomal material in this area.

Table II

Proteins whose expression level correlates with both mRNA and gene dose changes

Protein                Chromosomal  Tumor     CGH         Transcript         Protein
                       location     TCC       alteration  alteration^a       alteration
Annexin II             1q21         733       Gain        Abs to Pres        Increase
Annexin IV             2p13         733       Gain        3.9-Fold up        Increase
Cytokeratin 17         17q12-q21    827       Gain        3.8-Fold up        Increase
Cytokeratin 20         17q21.1      827       Gain        5.6-Fold up        Increase
(PA-)FABP              8q21.2       827       Loss        10-Fold down       Decrease
FBP1                   9q22         827       Gain        2.3-Fold up        Increase
Plasma gelsolin        9q31         827       Gain        Abs to Pres        Increase
Heat shock protein 28  15q12-q13    827       Loss        2.5-Fold up        Decrease
Prohibitin             17q21        827/733   Gain        3.7-/2.5-Fold up^b Increase
Prolyl-4-hydroxyl      17q25        827/733   Gain        6.7-/1.6-Fold up   Increase
hnRNPB1                7p15         827       Loss        2.5-Fold down      Decrease

^a Abs, absent; Pres, present.

^b In cases where the corresponding alterations were found in both TCCs 827 and 733 these are shown as 827/733.

Considering the many possible regulatory mechanisms
acting at the level of transcription, it seems striking that the gene
dose effects were so clearly detectable in gained areas. One
hypothetical explanation may lie in the loss of controlled
methylation in tumor cells (17-19). Thus, it may be possible
that in chromosomes with increased DNA copy numbers two
or more alleles could be demethylated simultaneously leading
to a higher transcription level, whereas in chromosomes with
losses the remaining allele could be partly methylated, turning
off the process (20, 21). A recent report has documented a
ploidy regulation of gene expression in yeast, but in this case all
the genes were present in the same ratio (22), a situation that is
not analogous to that of cancer cells, which show marked
chromosomal aberrations, as well as gene dosage effects.

Several CGH studies of bladder cancer have shown that
some chromosomal aberrations are common at certain
stages of disease progression, often occurring in more than 1
of 3 tumors. In pTa tumors, these include 9p-, 9q-, 1q+, Y-
(2, 6), and in pT1 tumors, 2q-, 11p-, 11q-, 1q+, 5p+, 8q+,
17q+, and 20q+ (2-4, 6, 7). The pTa tumors studied here
showed similar aberrations such as 9p- and 9q22-q33- and
9q- and [...], respectively. Likewise, the two minimal invasive
pT1 tumors showed aberrations that are commonly seen at
that stage, and TCC 827 had a remarkable resemblance to the
commonly seen pattern of losses and gains, such as 1q22-24
amplification (seen in both tumors), 11q14-q22 loss, the latter
often linked to 17q+ (both tumors), and 1q+ and 9p-, often
linked to 20q+ and 11q13+ (both tumors) (7-9). These
observations indicate that the pairs of tumors used in this study
exhibit chromosomal changes observed in many tumors, and
therefore the findings could be of general importance for
bladder cancer.

Considering that the mapping resolution of CGH is about
20 megabases, it is only possible to get a crude picture of
chromosomal instability using this technique. Occasionally,
we observed reduced transcript levels close to or inside
regions with increased copy numbers. Analysis of these regions
by positioning heterozygous microsatellites as close as
possible to the locus showing reduced gene expression revealed
loss of heterozygosity in several cases. It seems likely that
multiple and different events occur along each chromosomal
arm and that the use of cDNA microarrays for analysis of DNA
copy number changes will reach a resolution that can resolve
these changes, as has recently been proposed (2). The outlier
data were not more frequent at the boundaries of the CGH
aberrations. At present we do not know the mechanism
behind chromosomal aneuploidy and cannot predict whether
chromosomal gains will be transcribed to a larger extent than
the two native alleles. A mechanism such as genetic imprinting has
an impact on the expression level in normal cells and is often
reduced in [...] imprinting
and gain of chromosomal material is not known.

We regard it as a strength of this investigation that we were
able to compare invasive tumors to benign tumors rather than
to normal urothelium, as the tumors studied were biologically
very close, and probably may represent successive steps in
the progression of bladder cancer. Despite the limited amount
of fresh tissue available it was possible to apply three different
state of the art methods. The observed correlation between
DNA copy number and mRNA expression is remarkable when
one considers that different pieces of the tumor biopsies were
used for the different sets of experiments. This indicates that
bladder tumors are relatively homogeneous, a notion recently
supported by CGH and LOH data that showed a remarkable
similarity even between tumors and distant metastasis (10, 23).

In the few cases analyzed, mRNA and protein levels
showed a striking correspondence although in some cases
we found discrepancies that may be attributed to translational
regulation, post-translational processing, protein
degradation, or a combination of these. Some transcripts belong to
undertranslated mRNA pools, which are associated with few
translationally inactive ribosomes; these pools, however,
seem to be rare (24). Protein degradation, for example, may
be very important in the case of polypeptides with a short
half-life (e.g. signaling proteins). A poor correlation between
mRNA and protein levels was found in liver cells as
determined by arrays and 2D-PAGE (25), and a moderate
correlation was recently reported by Ideker et al. (26) in yeast.

Interestingly, our study revealed a much better correlation
between gained chromosomal areas and increased mRNA
levels than between loss of chromosomal areas and reduced
mRNA levels. In general, the level of CGH change determined
the ability to detect a change in transcript. One possible
explanation could be that by losing one allele the change in
mRNA level is not so dramatic as compared with gain of
material, which can be rather unlimited and may lead to a
severalfold increase in gene copy number resulting in a much
higher impact on transcript level. The latter would be much
easier to detect on the expression arrays as the cut-off point
was placed at a 2-fold level so as not to be biased by noise on
the array. Construction of arrays with a better signal to noise
ratio may in the future allow detection of lesser than 2-fold
alterations in transcript levels, a feature that may facilitate the
analysis of the effect of loss of chromosomal areas on
transcript levels.
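
The cut-off argument above can be illustrated with a short sketch. It assumes, purely for illustration, that transcript level scales linearly with gene copy number from a diploid baseline; the function names and the linear model are hypothetical and are not part of the study.

```python
def expected_fold_change(copies: int, baseline: int = 2) -> float:
    """Expected transcript ratio under an assumed linear gene-dose model."""
    if copies <= 0 or baseline <= 0:
        raise ValueError("copy numbers must be positive")
    return copies / baseline


def passes_cutoff(copies: int, cutoff: float = 2.0) -> bool:
    """True if the expected change (up or down) reaches the fold cut-off."""
    fold = expected_fold_change(copies)
    magnitude = fold if fold >= 1.0 else 1.0 / fold
    return magnitude >= cutoff


# Loss of one allele (2 -> 1 copies) gives exactly a 2-fold decrease,
# i.e. it sits on the detection border; gains can go well beyond it.
for copies in (1, 3, 4, 6):
    print(copies, expected_fold_change(copies), passes_cutoff(copies))
```

Under this assumed model a single-allele loss can never exceed the 2-fold border, whereas a gain of several copies clears it easily, which is consistent with the asymmetry described in the text.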
In eleven cases we found a significant correlation between
DNA copy number, mRNA expression, and protein level. Four
of these proteins were encoded by genes located at a
frequently amplified area in chromosome 17q. Whether DNA
copy number is one of the mechanisms behind alteration of
these eleven proteins is at present unknown and will have to
be proved by other methods using a larger number of
samples. One factor making such studies complicated is the large
extent of protein modification that occurs after translation,
requiring immunoidentification and/or mass spectrometry to
correctly identify the proteins in the gels.

In conclusion, the results presented in this study exemplify
the large body of knowledge that may be possible to gather in
the future by combining state of the art techniques that follow
the pathway from DNA to protein (26). Here, we used a
traditional chromosomal CGH method, but in the future high
resolution CGH based on microarrays with many thousand radiation
hybrid-mapped genes will increase the resolution and
information derived from these types of experiments (2). Combined with
expression arrays analyzing transcripts derived from genes with
known locations, and 2D gel analysis to obtain information at
the post-translational level, a clearer and more developed
understanding of the tumor genome will be forthcoming.
Impact of DNA Amplification on Gene Expression Patterns in Breast Cancer

ABSTRACT

Genetic changes underlie tumor progression and may lead to
cancer-specific expression of critical genes. Over 1100 publications have
described the use of comparative genomic hybridization (CGH) to analyze
the pattern of copy number alterations in cancer, but very few of the genes
affected are known. Here, we performed high-resolution CGH analysis on
cDNA microarrays in breast cancer and directly compared copy number
and mRNA expression levels of 13,824 genes to quantitate the impact of
genomic changes on gene expression. We identified and mapped the
boundaries of 24 independent amplicons, ranging in size from 0.2 to 12
Mb. Throughout the genome, both high- and low-level copy number
changes had a substantial impact on gene expression, with 44% of the
highly amplified genes showing overexpression and 10.5% of the highly
overexpressed genes being amplified. Statistical analysis with random
permutation tests identified 270 genes whose expression levels across 14
samples were systematically attributable to gene amplification. These
included most previously described amplified genes in breast cancer and
many novel targets for genomic alterations, including the HOXB7 gene,
the presence of which in a novel amplicon at 17q21.3 was validated in
10.2% of primary breast cancers and associated with poor patient
prognosis. In conclusion, CGH on cDNA microarrays revealed hundreds of
novel genes whose overexpression is attributable to gene amplification.
These genes may provide insights to the clonal evolution and progression
of breast cancer and highlight promising therapeutic targets.

INTRODUCTION 

Gene expression patterns revealed by cDNA microarrays have
facilitated classification of cancers into biologically distinct
categories, some of which may explain the clinical behavior of the tumors
(1-6). Despite this progress in diagnostic classification, the molecular
mechanisms underlying gene expression patterns in cancer have
remained elusive, and the utility of gene expression profiling in the
identification of specific therapeutic targets remains limited.
Accumulation of genetic defects is thought to underlie the clonal
evolution of cancer. Identification of the genes that mediate the effects
of genetic changes may be important by highlighting transcripts that
are actively involved in tumor progression. Such transcripts and their
encoded proteins would be ideal targets for anticancer therapies, as
demonstrated by the clinical success of new therapies against
amplified oncogenes, such as ERBB2 and EGFR (7, 8), in breast cancer and
other solid tumors. Besides amplifications of known oncogenes, over
20 recurrent regions of DNA amplification have been mapped in
breast cancer by CGH (9, 10). However, these amplicons are often
large and poorly defined, and their impact on gene expression remains
unknown.

We hypothesized that genome-wide identification of those gene
expression changes that are attributable to underlying gene copy
number alterations would highlight transcripts that are actively
involved in the causation or maintenance of the malignant phenotype.
To identify such transcripts, we applied a combination of cDNA and
CGH microarrays to: (a) determine the global impact that gene copy
number variation plays in breast cancer development and progression;
and (b) identify and characterize those genes whose mRNA
expression is most significantly associated with amplification of the
corresponding genomic template.

The abbreviations used are: CGH, comparative genomic hybridization; FISH,
fluorescence in situ hybridization; RT-PCR, reverse transcription-PCR.

[Figure legend, largely illegible in the source: copy number ratio profile
of the cDNA clones along the human genome; the distribution of the
expression ratios is also shown at the bottom of the figure.]

MATERIALS AND METHODS 



Cell Lines. Fourteen cancer cell lines (BT-20, BT-474, [...],
MCF7, MDA-361, MDA-436, MDA-453, MDA-468,
SKBR-3, T-47D, UACC812, ZR-75-1, and ZR-75-30) were obtained from the
American Type Culture Collection (Manassas, VA). Cells were grown under
recommended culture conditions. Genomic DNA and mRNA were isolated
using standard protocols.

Copy Number and Expression Analyses by cDNA Microarrays. The
preparation and printing of the 13,824 cDNA clones on glass slides were
[...] corresponded to [...]. [...] cDNA microarrays were done as
described (14, [...]). [...] of genomic DNA from breast cancer cell
lines and normal human WBCs were digested for 14-18 h with AluI and RsaI
(Life Technologies, Inc., Rockville, MD) and purified by
phenol/chloroform extraction. Six µg of digested cell line DNAs were
labeled with Cy3-dUTP (Amersham Pharmacia) and normal DNA with Cy5-dUTP
(Amersham Pharmacia) using the Bioprime Labeling kit (Life Technologies,
Inc.). Hybridization (14, 15) and [...] (13) were done as described. For
the expression analyses, a standard reference (Universal Human Reference
RNA; Stratagene, [...]) [...] experiments. Forty µg of reference RNA were
labeled with Cy3-dUTP and 3.5 µg of test mRNA with Cy5-dUTP, and the
labeled cDNAs were hybridized on microarrays as described (13, 15). For both
microarray analyses, a laser confocal scanner (Agilent Technologies, Palo
Alto, CA) was used to measure the fluorescence intensities at the target
locations using the DEARRAY software (16). After background subtraction,
average intensities at each clone in the test hybridization were divided by the
average intensity of the corresponding clone in the control hybridization. For
the copy number analysis, the ratios were normalized on the basis of the
distribution of ratios of all targets on the array and for the expression analysis
on the basis of 88 housekeeping genes, which were spotted four times onto the
array. Low quality measurements (i.e., copy number data with mean reference
intensity <100 fluorescent units, and expression data with both test and
reference intensity <100 fluorescent units and/or with spot size <50 units)
were excluded from the analysis and were treated as missing values. The
distributions of fluorescence ratios were used to define cutpoints for
increased/decreased copy number. Genes with CGH ratio >1.43 (representing
the upper 5% of the CGH ratios across all experiments) were considered to be
amplified, and genes with ratio <0.73 (representing the lower 5%) were
considered to be deleted.

Statistical Analysis of CGH and cDNA Microarray Data. To evaluate
the influence of copy number alterations on gene expression, we applied the
following statistical approach. CGH and cDNA calibrated intensity ratios were
log-transformed and normalized using median centering of the values in each
cell line. Furthermore, cDNA ratios for each gene across all 14 cell lines were
median centered. For each gene, the CGH data were represented by a vector
that was labeled 1 for amplification (ratio, >1.43) and 0 for no amplification.
Amplification was correlated with gene expression using the signal-to-noise
statistic, with a weight $w_g$ calculated for each gene as

$$ w_g = \frac{m_{g1} - m_{g0}}{\sigma_{g1} + \sigma_{g0}} $$

where $m_{g1}$, $\sigma_{g1}$ and $m_{g0}$, $\sigma_{g0}$ denote the means
and SDs for the expression levels for amplified and nonamplified cell lines,
respectively. To assess the statistical significance of each weight, we
performed 10,000 random permutations of the label vector. The probability
that a gene had a larger or equal weight by random permutation than the
original weight was denoted by $\alpha$. A low $\alpha$ (<0.05) indicates a
strong association between gene expression and amplification.
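
The weight and permutation procedure described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the sample standard deviation is assumed (the text does not say which estimator was used), and each group is assumed to contain at least two cell lines with non-identical values.

```python
import random
import statistics


def signal_to_noise(expression, labels):
    """Weight w = (m1 - m0) / (s1 + s0) for one gene.

    expression: median-centered log ratios, one per cell line.
    labels: 1 if the gene is amplified in that cell line, else 0.
    """
    amp = [e for e, lab in zip(expression, labels) if lab == 1]
    non = [e for e, lab in zip(expression, labels) if lab == 0]
    if len(amp) < 2 or len(non) < 2:
        raise ValueError("need at least two cell lines in each group")
    # Assumption: sample SD; raises ZeroDivisionError if both SDs are 0.
    spread = statistics.stdev(amp) + statistics.stdev(non)
    return (statistics.mean(amp) - statistics.mean(non)) / spread


def permutation_alpha(expression, labels, n_perm=10000, seed=0):
    """Fraction of label permutations whose weight is >= the observed one."""
    rng = random.Random(seed)
    observed = signal_to_noise(expression, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if signal_to_noise(expression, shuffled) >= observed:
            hits += 1
    return hits / n_perm


# Hypothetical gene: overexpressed in the 4 cell lines where it is amplified.
expr = [3.0, 3.2, 2.9, 3.1, 0.1, -0.2, 0.0, 0.2, -0.1, 0.1, 0.0, -0.1, 0.2, -0.2]
labs = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
print(signal_to_noise(expr, labs), permutation_alpha(expr, labs, n_perm=2000))
```

Shuffling only the label vector keeps the expression values and the group sizes fixed, so the resulting fraction estimates how often a weight this large would arise if amplification status were unrelated to expression.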

Genomic Localization of cDNA Clones and Amplicon Mapping. Each
cDNA clone on the microarray was assigned to a Unigene cluster using the
Unigene Build 141. A database of genomic sequence alignment information
for mRNA sequences was created from the August 2001 freeze of the
University of California Santa Cruz's GoldenPath database. The chromosome and
bp positions for each cDNA clone were then retrieved by relating these data
sets. Amplicons were defined as a CGH copy number ratio >2.0 in at least two
adjacent clones in two or more cell lines or a CGH ratio >2.0 in at least three
adjacent clones in a single cell line. The amplicon start and end positions were
extended to include neighboring nonamplified clones (ratio, <1.5). The
amplicon size determination was partially dependent on local clone density.

Table 1 Summary of independent amplicons in 14 breast cancer cell lines by
CGH microarray

Location: 1p13; 1q21; 1q22; 3p14; 7p12.1-7p11.2; 7q31; 7q32;
8q21.11-8q21.13; 8q21.3; 8q23.3-8q24.14; 8q24.22; 9p13; 13q22-q31; 17q11;
17q12-q21.2; 17q21.32-q21.33; 17q22-q23.3; 17q23.3-q24.3; 19q13; 20q11;
20q13.12; 20q13.12-q13.13; 20q13.2-q13.32

Start (Mb): 132.79; 71.94; 55.62; 125.73; 140.01; 86.45; 98.45; 129.88;
151.21; 38.65; 77.15; 86.70; 29.30; 39.79; 52.47; 63.81; 69.93; 40.63;
34.59; 44.00; 46.45; 51.32

End (Mb): 132.94; 177.45; 179.57; 74.66; 60.95; 130.96; 140.68; 92.46;
103.05; 142.15; 152.16; 39.25; 81.38; 87.52; 30.85; 42.80; 55.80; 69.70;
74.99; 41.40; 35.85; 45.62; 49.43; 59.12

Size (Mb): 0.2; 3.3; 0.3; 2.7; 5.3; 5.2; 0.7; 6.0; 4.6; 12.3; 1.0; 0.6;
4.2; 0.9; 1.6; 3.0; 3.3; 5.5; 5.1; 0.8; 1.3; 1.6; 3.0; 7.8
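
The amplicon definition above (ratio >2.0 in at least two adjacent clones in two or more cell lines, or in at least three adjacent clones in a single cell line) can be sketched as follows. This is one possible reading of the rule, not the authors' code: the function names are hypothetical, clones are assumed to be ordered by genomic position on a single chromosome with no missing values, and the boundary-extension step to neighboring nonamplified clones is not implemented.

```python
def runs_above(ratios, threshold):
    """Yield (start, end) index pairs, end exclusive, of consecutive
    clones whose CGH ratio exceeds the threshold."""
    start = None
    for i, ratio in enumerate(ratios):
        if ratio > threshold:
            if start is None:
                start = i
        elif start is not None:
            yield start, i
            start = None
    if start is not None:
        yield start, len(ratios)


def amplicon_clones(ratios_by_line, threshold=2.0):
    """Return sorted indices of clones that fall inside an amplicon.

    ratios_by_line: dict mapping cell line name -> list of CGH ratios,
    all lists covering the same clones in the same genomic order.
    """
    n_clones = len(next(iter(ratios_by_line.values())))
    # pair_support[i]: number of cell lines where clones i and i+1 both exceed
    pair_support = [0] * max(n_clones - 1, 0)
    flagged = set()
    for ratios in ratios_by_line.values():
        for start, end in runs_above(ratios, threshold):
            if end - start >= 3:          # three adjacent clones, one line
                flagged.update(range(start, end))
            for i in range(start, end - 1):
                pair_support[i] += 1
    for i, count in enumerate(pair_support):
        if count >= 2:                    # two adjacent clones, two+ lines
            flagged.update((i, i + 1))
    return sorted(flagged)


# Hypothetical ratios for eight neighboring clones in three cell lines.
example = {
    "line_A": [1.0, 2.5, 2.6, 1.1, 1.0, 3.0, 3.1, 3.2],
    "line_B": [1.0, 2.2, 2.4, 1.0, 1.0, 1.0, 1.0, 1.0],
    "line_C": [2.5, 1.0, 1.0, 2.1, 2.3, 1.0, 1.0, 1.0],
}
print(amplicon_clones(example))
```

In this made-up example, clones 1-2 qualify through the two-line rule, clones 5-7 through the single-line rule, and the isolated pair in line_C does not qualify.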

FISH. Dual-color interphase FISH to breast cancer cell lines was done as
described (17). Bacterial artificial chromosome clone RP11-361K8 was
labeled with SpectrumOrange (Vysis, Downers Grove, IL), and
SpectrumOrange-labeled probe for EGFR was obtained from Vysis.
SpectrumGreen-labeled chromosome 7 and 17 centromere probes (Vysis) were
used as a reference. A tissue microarray containing 612 formalin-fixed,
paraffin-embedded primary breast cancers (17) was applied in FISH analyses
as described (18). The use of these specimens was approved by the Ethics
Committee of the University of Basel and by the NIH. Specimens containing
a 2-fold or higher increase in the number of test probe signals, as compared
with corresponding centromere signals, in at least 10% of the tumor cells
were considered to be amplified. Survival analysis was performed using the
Kaplan-Meier method and the log-rank test.
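
The FISH scoring rule in the paragraph above reduces to a simple per-specimen check. The sketch below assumes hypothetical per-cell signal counts as input and uses integer arithmetic to avoid rounding at the 10% boundary; it is an illustration of the stated criterion, not the scoring procedure actually used.

```python
def cell_is_amplified(test_signals, centromere_signals, min_ratio=2):
    """A cell counts as amplified if test probe signals are at least
    min_ratio times the centromere signals (centromere count must be > 0)."""
    return centromere_signals > 0 and test_signals >= min_ratio * centromere_signals


def specimen_is_amplified(cells, min_percent=10):
    """cells: list of (test_signals, centromere_signals) per tumor cell.

    Returns True if at least min_percent of the scored cells are amplified.
    """
    if not cells:
        raise ValueError("no cells scored")
    amplified = sum(1 for test, cen in cells if cell_is_amplified(test, cen))
    return 100 * amplified >= min_percent * len(cells)


# Hypothetical specimen: 10 of 100 cells show 8 test vs 2 centromere signals.
specimen = [(8, 2)] * 10 + [(2, 2)] * 90
print(specimen_is_amplified(specimen))
```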

RT-PCR. The HOXB7 expression level was determined relative to
GAPDH. Reverse transcription and PCR amplification were performed using
Access RT-PCR System (Promega Corp., Madison, WI) with 10 ng of mRNA
as a template. HOXB7 primers were 5'-GAGCAOAGGGACTCOOACTT-3 ' 
and 5'-G<XmVVOOTAGC<^TTC3TA0.3'. 

RESULTS

Global Effect of Copy Number on Gene Expression. 13,824
arrayed cDNA clones were applied for analysis of gene expression
and gene copy number (CGH microarrays) in 14 breast cancer cell
lines. The results illustrate a considerable influence of copy number
on gene expression patterns. Up to 44% of the highly amplified
transcripts (CGH ratio, >2.5) were overexpressed (i.e., belonged to
the global upper 7% of expression ratios), compared with only 6% for
genes with normal copy number levels (Fig. 1A). Conversely, 10.5%
of the transcripts with high-level expression (cDNA ratio, >10)
showed increased copy number (Fig. 1B). Low-level copy number
increases and decreases were also associated with similar, although
less dramatic, outcomes on gene expression (Fig. 1).
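
The kind of tabulation behind the 44% figure can be sketched as follows, assuming a hypothetical list of (CGH ratio, expression ratio) pairs per measurement. The function names and the exact percentile convention are assumptions made for illustration; the text only states the thresholds (CGH ratio >2.5, global upper 7% of expression ratios).

```python
def upper_tail_cutoff(values, top_percent):
    """Smallest value that still belongs to the top `top_percent` of values."""
    ranked = sorted(values, reverse=True)
    k = max(1, len(ranked) * top_percent // 100)  # integer count of top values
    return ranked[k - 1]


def fraction_overexpressed(pairs, cgh_threshold=2.5, top_percent=7):
    """pairs: list of (cgh_ratio, expression_ratio).

    Returns the fraction of highly amplified measurements whose expression
    ratio lies in the global upper `top_percent` of expression ratios.
    """
    cutoff = upper_tail_cutoff([expr for _, expr in pairs], top_percent)
    amplified = [expr for cgh, expr in pairs if cgh > cgh_threshold]
    if not amplified:
        raise ValueError("no measurements above the CGH threshold")
    return sum(1 for expr in amplified if expr >= cutoff) / len(amplified)


# Hypothetical data: 100 measurements with expression ratios 1..100;
# four are highly amplified, two of which fall in the top 7% (>= 94).
data = [(3.0 if expr in (100, 99, 50, 40) else 1.0, float(expr))
        for expr in range(1, 101)]
print(fraction_overexpressed(data))
```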

Identification of Distinct Breast Cancer Amplicons. Base-pair
locations obtained for 11,994 cDNAs (86.8%) were used to plot copy
number changes as a function of genomic position (Fig. 2,
Supplement Fig. A). The average spacing of clones throughout the genome
was 267 kb. This high-resolution mapping identified 24 independent
breast cancer amplicons, spanning from 0.2 to 12 Mb of DNA (Table
1). Several amplification sites detected previously by chromosomal
CGH were validated, with 1q21, 17q12-q21.2, 17q22-q23, 20q13.1,
and 20q13.2 regions being most commonly amplified. Furthermore,
the boundaries of these amplicons were precisely delineated. In
addition, novel amplicons were identified at 9p13 (38.65-39.25 Mb)
and 17q21.3 (52.47-55.80 Mb).

Direct Identification of Putative Amplification Target Genes.
The cDNA/CGH microarray technique enables the direct
correlation of copy number and expression data on a gene-by-gene basis
throughout the genome. We directly annotated high-resolution
CGH plots with gene expression data using color coding. Fig. 2C
shows that most of the amplified genes in the MCF-7 breast cancer
cell line at 1p13, 17q22-q23, and 20q13 were highly
overexpressed. A view of chromosome 7 in the MDA-468 cell line
implicates EGFR as the most highly overexpressed and amplified
gene at 7p11-p12 (Fig. 3A). In BT-474, the two known amplicons
at 17q12 and 17q22-q23 contained numerous highly
overexpressed genes (Fig. 3B). In addition, several genes, including the
homeobox genes HOXB2 and HOXB7, were highly amplified in a
previously undescribed independent amplicon at 17q21.3. HOXB7
was systematically amplified (as validated by FISH, Fig. 3B, inset)
as well as overexpressed (as verified by RT-PCR, data not shown)
in BT-474, UACC812, and ZR-75-30 cells. Furthermore, this novel
Fig. 3. Annotation of gene exp[...]
7p11-p12 amplicon in the MDA-468 cell line are highly expressed (red dots) and include
the EGFR oncogene. B, several genes in the 17q12, 17q21.3, and 17q23 amplicons in the
BT-474 breast cancer cell line are highly overexpressed (red) and include the HOXB7
gene. The data labels and color coding are as indicated for Fig. 2C. Insets show
chromosomal CGH profiles for the corresponding chromosomes and validation of the
increased copy number by interphase FISH using EGFR (red) and chromosome 7
centromere probe (green) to MDA-468 (A) and HOXB7-specific probe (red) and
chromosome 17 centromere (green) to BT-474 cells (B).
level copy number increase. Low-level copy number gains and losses
also had a significant influence on expression levels of genes in the
regions affected, but these effects were more subtle on a gene-by-gene
basis than those of high-level amplifications. However, the impact of
low-level gains on the dysregulation of gene expression patterns in
cancer may be equally important if not more important than that of
high-level amplifications. Aneuploidy and low-level gains and losses
of chromosomal arms represent the most common types of genetic
alterations in breast and other cancers and, therefore, have an
influence on many genes. Our results in breast cancer extend the recent
studies on the impact of aneuploidy on global gene expression
patterns in yeast cells, acute myeloid leukemia, and a prostate cancer
model system (22-24).

The CGH microarray analysis identified 24 independent breast
cancer amplicons. We defined the precise boundaries for many
amplicons detected previously by chromosomal CGH (9, 10, 25, 26) and
also discovered novel amplicons that had not been detected
previously, presumably because of their small size (only 1-2 Mb) or close
proximity to other larger amplicons. One of these novel amplicons
involved the homeobox gene region at 17q21.3 and led to the
overexpression of the HOXB7 and HOXB2 genes. The homeodomain
transcription factors are known to be key regulators of embryonic
development and have been occasionally reported to undergo aberrant
expression in cancer (27, 28). HOXB7 transfection induced cell
proliferation in melanoma, breast, and ovarian cancer cells and increased
tumorigenicity and angiogenesis in breast cancer (29-32). The
present results imply that gene amplification may be a prominent
mechanism for overexpressing HOXB7 in breast cancer and suggest that
HOXB7 contributes to tumor progression and confers an aggressive
disease phenotype in breast cancer. This view is supported by our
finding of amplification of HOXB7 in 10% of 363 primary breast
cancers, as well as an association of amplification with poor prognosis
of the patients.

We carried out a systematic search to identify genes whose
expression levels across all 14 cell lines were attributable to
amplification status. Statistical analysis revealed 270 such genes
(representing ~2% of all genes on the array), including not only
previously described amplified genes, such as HER-2, MYC,
EGFR, ribosomal protein s6 kinase, and AIB3, but also numerous
novel genes such as NRAS-related gene (1p13), syndecan-2 (8q22),
and bone morphogenic protein (20q13.1), whose activation by
amplification may similarly promote breast cancer progression.
Most of the 270 genes have not been implicated previously in
breast cancer development and suggest novel pathogenetic
mechanisms. Although we would not expect all of them to be causally
involved, it is intriguing that 84% of the genes with associated
functional information were implicated in apoptosis, cell
proliferation, signal transduction, transcription, or other cellular processes
that could directly imply a possible role in cancer progression.
Therefore, a detailed characterization of these genes may provide
biological insights to breast cancer progression and might lead to
the development of novel therapeutic strategies.

In summary, we demonstrate application of cDNA microarrays
to the analysis of both copy number and expression levels of over
12,000 transcripts throughout the breast cancer genome, roughly
once every 267 kb. This analysis provided: (a) evidence of a
prominent global influence of copy number changes on gene
expression levels; (b) a high-resolution map of 24 independent
amplicons in breast cancer; and (c) identification of a set of 270
genes, the overexpression of which was statistically attributable to
gene amplification. Characterization of a novel amplicon at
17q21.3 implicated amplification and overexpression of the
HOXB7 gene in breast cancer, including a clinical association
between HOXB7 amplification and poor patient prognosis. Overall,
our results illustrate how the identification of genes activated by
gene amplification provides a powerful approach to highlight
genes with an important role in cancer as well as to prioritize and
validate putative targets for therapy development.
Genomic DNA copy number alterations are key genetic events in
[...] of human cancers. Here we
report a genome-wide microarray comparative genomic hybridization
[...] a series of primary human breast tumors. We have profiled DNA
[...] mapped human genes, in [...]
predominantly advanced, primary breast tumors and 10 breast
cancer cell lines. While the overall patterns of DNA amplification
and deletion corroborate previous cytogenetic studies, the
high-resolution (gene-by-gene) mapping of amplicon boundaries and
quantitative analysis of amplicon shape provide significant
improvement in the localization of candidate oncogenes. Parallel
microarray measurements of mRNA levels reveal the remarkable
[...] variation in gene copy number [...]
variation in gene expression in tumor cells. Specifically, we find
that [...] of highly amplified genes show moderately or highly
[...] that DNA copy number influences gene
expression across a wide range of DNA copy number alterations
[...] copy number is associated with a
corresponding [...] change in mRNA levels, and that overall, at least
[...] of the variation in gene expression among the breast
[...] attributable to underlying variation in gene copy
number. These findings provide evidence that widespread DNA
copy number alteration can lead directly to global deregulation of
[...] contribute to the development or
progression of [...].

Conventional cytogenetic techniques, including comparative
genomic hybridization (CGH) (1), have led to the identification
of a number of recurrent regions of DNA copy number
alteration in breast cancer cell lines and tumors (2-4). While
[...] known or candidate oncogenes
[... MYC (8q24), CCND1 (11q13), ERBB2
[...], ZNF217 (20q13)] and tumor suppressor genes
[... (17p13)], the relevant gene(s)
[...] and 17q22-24, and loss of
8p) remain to be identified. A high-resolution genome-wide
map, delineating the boundaries of DNA copy number
alterations in tumors, should facilitate the localization and
identification of oncogenes and tumor suppressor genes in breast
cancer. In this study, we have created such a map using
array-based CGH (5-7) to profile DNA copy number alteration
in a series of breast cancer cell lines and primary tumors.

An unresolved question is the extent to which the widespread
DNA copy number changes that we and others have identified
[...] expression of genes within involved
regions. Because we had measured mRNA levels in parallel in
the same samples (8), using the same DNA microarrays, we had
[...] on a genomic scale the relationship
between DNA copy number changes and gene expression. From
[...] we have identified a significant impact of
widespread DNA copy number alteration on the transcriptional
programs of breast tumors.

Materials and Methods 

Tumors and Cell Lines. Primary breast tumors were predominantly
large (>3 cm), intermediate-grade, infiltrating ductal carcinomas,
with more than 50% being lymph node positive. The fraction of
tumor cells within specimens averaged at least 50%.
Details of individual tumors have been published (8, 9), and
are summarized in Table 1, which is published as supporting
information on the PNAS web site, www.pnas.org. Breast cancer
cell lines were obtained from the American Type Culture
Collection. Genomic DNA was isolated either using genomic
DNA columns, or by phenol/chloroform extraction
followed by ethanol precipitation.

DNA Labeling and Microarray Hybridizations. Genomic DNA labeling
and hybridizations were performed essentially as described
in Pollack et al. (7), with slight modifications. Two micrograms
of DNA was labeled in a total volume of 50 microliters and the
volumes of all reagents were adjusted accordingly. "Test" DNA
(from tumors and cell lines) was fluorescently labeled (Cy5) and
hybridized to a human cDNA microarray containing 6,691
different mapped human genes (i.e., UniGene clusters). The
"reference" (labeled with Cy3) for each hybridization was normal
female leukocyte DNA from a single donor. The fabrication of
cDNA microarrays and the labeling and hybridization of
mRNA samples have been described (8).

Data Analysis and Map Positions. Hybridized arrays were scanned
[...] fluorescence ratio for all array elements [...]. Measurements
with fluorescence intensities more than 20% above background
were considered reliable. DNA copy number profiles
that deviated significantly from background ratios measured
in normal genomic DNA control hybridizations were interpreted as
evidence of real DNA copy number alteration (see Estimating
Significance of Altered Fluorescence Ratios in the supporting
information). When indicated, DNA copy number profiles are
displayed as a moving average (symmetric 5-nearest neighbors).
Map positions for arrayed human cDNAs were assigned by

Abbreviation: CGH, comparative genomic hybridization.

To whom reprint requests should be addressed at: [...],
CA 94305-5176. E-mail: pollack1@stanford.edu.

Present address: Zyomyx, Inc., Hayward, CA 94545.

[Figure legend, partly illegible: ... each column represents ...;
fluorescence ratios (test/reference) are ...; green luminescence
reflects fold-deletion, and ... luminescence reflects ...]



identifying the starting position of the best and longest match of
any DNA sequence represented in the corresponding UniGene
cluster (10) against the "Golden Path" genome assembly
(http://genome.ucsc.edu/; Oct 7, 2000 Freeze). For UniGene
clusters represented by multiple arrayed elements, mean
fluorescence ratios (for all elements representing the same UniGene
cluster) are reported. For mRNA measurements, fluorescence
ratios are "mean-centered" (i.e., reported relative to the mean
ratio across the 44 tumor samples). The data set described here
can be accessed in its entirety in the supporting information.
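
The two display conventions named in this section, the moving average and mean-centering, can be sketched as follows. This is only an illustration under stated assumptions: "symmetric 5-nearest neighbors" is read here as a centered window of five genes in map order, the window is allowed to shrink at the ends of a chromosome, and the centering is done on log ratios. The function names and the sample numbers are hypothetical and are not taken from the study.

```python
def moving_average(values, window=5):
    """Centered moving average over genes in map order.

    Assumption: the window is symmetric around each gene and simply
    shrinks near the two ends of the list.
    """
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def mean_center(log_ratios):
    """Report one gene's log ratios relative to their mean across samples."""
    mean = sum(log_ratios) / len(log_ratios)
    return [x - mean for x in log_ratios]


# Hypothetical log2(test/reference) ratios for seven neighboring genes.
profile = [0.0, 0.1, 2.0, 2.2, 1.9, 0.0, -0.1]
print(moving_average(profile))

# Hypothetical log2 mRNA ratios for one gene across three samples.
print(mean_center([1.0, 2.0, 3.0]))
```

Smoothing trades gene-by-gene resolution for noise reduction, which is presumably why the text applies it only "when indicated".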
Results 

We performed CGH on 44 predominantly locally advanced,
primary breast tumors and 10 breast cancer cell lines, using
cDNA microarrays containing 6,691 different mapped human
genes (Fig. 1a; also see Materials and Methods for details of
microarray hybridizations). To take full advantage of the
improved spatial resolution of array CGH, we ordered
(fluorescence ratios for) the 6,691 cDNAs according to the
"Golden Path" (http://genome.ucsc.edu/) genome assembly of the
draft human genome sequences (11). In so doing, arrayed cDNAs
not only themselves represent genes of potential interest
(e.g., [...]), but also provide precise
genetic landmarks for chromosomal regions of amplification and

www.pnas.org/cgi/doi/10.1073/pnas.162471999



deletion. Parallel analysis of DNA from cell lines containing
different numbers of X chromosomes (Fig. 1b), as we did before
(7), demonstrated the sensitivity of our method to detect
single-copy loss (45,XO), and 1.5- (47,XXX), 2- (48,XXXX), or
2.5-fold (49,XXXXX) gains (also see Fig. 5, which is published
as supporting information on the PNAS web site). Fluorescence
ratios were linearly proportional to copy number ratios, which
were slightly underestimated, in agreement with previous
observations (7). Numerous DNA copy number alterations were
evident in both the breast cancer cell lines and primary tumors
(Fig. 1a), detected in the tumors despite the presence of euploid
non-tumor cell types; the magnitudes of the observed changes
were generally lower in the tumor samples. DNA copy number
alterations were found in every cancer cell line and tumor, and
on every human chromosome in at least one sample. Recurrent
regions of DNA copy number gain and loss were readily
identifiable. For example, gains within 1q, 8q, 17q, and 20q were
observed in a high proportion of breast cancer cell lines/tumors
(90%/69%, 100%/47%, 100%/60%, and 90%/44%, respectively),
as were losses within 1p, 3p, 8p, and 13q (80%/24%,
[...]/22%, [...]/22%, and [...]/18%, respectively), consistent
with published cytogenetic studies (refs. 2-4; a complete listing
of gains/losses is provided in Tables 2 and 3, which are published
as supporting information on the PNAS web site). The total
number of genomic alterations (gains and losses) was found to
be significantly higher in breast tumors that were high grade
(P = 0.008), consistent with published CGH data (3), estrogen
receptor negative (P = 0.04), and harboring TP53 mutations
(P = 0.0006) (see Table 4, which is published as supporting
information on the PNAS web site).

The improved spatial resolution of our array CGH analysis is 
illustrated for chromosome 8, which displayed extensive DNA 
copy number alteration in our series. A detailed view of the 
variation in the copy number of 241 genes mapping to chromo- 
some 8 revealed multiple regions of recurrent amplification; 
each of these potentially harbors a different known or previously 
uncharacterized oncogene (Fig. 2a). The complexity of amplicon
structure is most easily appreciated in the breast cancer cell line
SKBR3. Although a conventional CGH analysis of 8q in SKBR3
identified only two distinct regions of amplification (12), we
observed three distinct regions of high-level amplification
(labeled 1-3 in Fig. 2b). For each of these regions we can define the
boundaries of the interval recurrently amplified in the tumors we
examined; in each case, known or plausible candidate oncogenes
can be identified (a description of these regions, as well as the
recurrently amplified regions on chromosomes 17 and 20, can be
found in Figs. 6 and 7, which are published as supporting
information on the PNAS web site).

For a subset of breast cancer cell lines and tumors (4 and 37, 
respectively), and a subset of arrayed genes (6,095), mRNA 
levels were quantitatively measured in parallel by using cDNA 
microarrays (8). The parallel assessment of mRNA levels is 
useful in the interpretation of DNA copy number changes. For 
example, the highly amplified genes that are also highly ex- 
pressed are the strongest candidate oncogenes within an ampli- 
con. Perhaps more significantly, our parallel analysis of DNA 
copy number changes and mRNA levels provides us the oppor- 
tunity to assess the global impact of widespread DNA copy 
number alteration on gene expression in tumor cells. 

A strong influence of DNA copy number on gene expression 
is evident in an examination of the pseudocolor representations
of DNA copy number and mRNA levels for genes on chromosome
17 (Fig. 3). The overall patterns of gene amplification and
elevated gene expression are quite concordant; i.e., a significant
fraction of highly amplified genes appear to be correspondingly
highly expressed. The concordance between high-level
amplification and increased gene expression is not restricted to
chromosome 17. Genome-wide, of 117 high-level DNA amplifications
(fluorescence ratios >4, and representing 91 different
genes), 62% (representing 54 different genes; see Table 5, which
is published as supporting information on the PNAS web site)
are found associated with at least moderately elevated mRNA
levels (mean-centered fluorescence ratios >2), and 42%
(representing 36 different genes) are found associated with
comparably highly elevated mRNA levels (mean-centered fluorescence
ratios >4).
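
The tally described above amounts to thresholding paired measurements. The following is a minimal sketch, assuming each measurement is a (DNA ratio, mean-centered mRNA ratio) pair; the function name and the example pairs are hypothetical and are not data from the study.

```python
def concordance(pairs, dna_cutoff=4.0, rna_cutoff=2.0):
    """Fraction of highly amplified measurements with elevated mRNA.

    pairs: iterable of (dna_ratio, mrna_ratio) tuples.
    A measurement counts as highly amplified when dna_ratio > dna_cutoff.
    """
    amplified = [(d, r) for d, r in pairs if d > dna_cutoff]
    if not amplified:
        return 0.0
    elevated = sum(1 for _, r in amplified if r > rna_cutoff)
    return elevated / len(amplified)


# Hypothetical measurements; the last pair is not highly amplified
# and is therefore ignored by the tally.
pairs = [(5.0, 4.5), (6.2, 2.5), (4.1, 1.1), (8.0, 5.0), (1.2, 3.0)]
print(concordance(pairs, rna_cutoff=2.0))  # share with mRNA ratio > 2
print(concordance(pairs, rna_cutoff=4.0))  # share with mRNA ratio > 4
```

With the study's own cutoffs (DNA ratio >4; mRNA ratio >2 or >4), the same tally would correspond to the 62% and 42% figures reported in the text.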

To determine the extent to which DNA deletion and lower- 
level amplification (in addition to high-level amplification) are 
also associated with corresponding alterations in mRNA levels, 
we performed three separate analyses on the complete data set 
(4 cell lines and 37 tumors, across 6,095 genes). First, we
determined the average mRNA levels for each of five classes
of genes, representing DNA deletion, no change, and low-,
medium-, and high-level amplification (Fig. 4a). For both the
breast cancer cell lines and tumors, average mRNA levels
tracked with DNA copy number across all five classes, in a
statistically significant fashion (P values for pair-wise Student's
t tests comparing adjacent classes: cell lines, 4 x 10^[...],
1 x 10^-49, 5 x 10^[...], 1 x 10^[...]; tumors, 1 x 10^-43,
1 x [...], 5 x 10^[...], 1 x 10^-4). A linear regression of the
average log(DNA copy number), for each class, against average
log(mRNA level) demonstrated that on average, a 2-fold change in
DNA copy number was accompanied by 1.4- and 1.5-fold changes in
mRNA level for the breast cancer cell lines and tumors,
respectively (Fig. 4a, regression line not shown). Second, we
characterized the distribution of the 6,095 correlations between
DNA copy number and mRNA level, each across the 37 tumor samples
(Fig. 4b). The distribution of correlations forms a normal-shaped
curve but with the peak markedly shifted in the positive direction
from zero. This shift is statistically significant, as evidenced in
a plot of observed vs. expected correlations (Fig. 4c), and reflects
a pervasive global influence of DNA copy number alterations on
gene expression. Notably, the highest correlations between DNA
copy number and mRNA level (the right tail of the distribution
in Fig. 4b) comprise both amplified and deleted genes (data not
shown). Third, we used a linear regression model to estimate the
fraction of all variation measured in mRNA levels among the 37
tumors that could be attributed to underlying variation in DNA
copy number. From this analysis, we estimate that, overall, about
7% of all of the observed variation in mRNA levels can be
explained directly by variation in copy number of the altered
genes (Fig. 4d). We can reduce the effects of experimental
measurement error on this estimate by using only that fraction
of the data most reliably measured (fluorescence intensity/
background >3); using that data, our estimate of the percent
variation in mRNA levels directly attributed to variation in gene
copy number increases to 12% (Fig. 4d). This still undoubtedly
represents a significant underestimate, as the observed variation
in global gene expression is affected not only by true variation in
the expression programs of the tumor cells themselves, but also
by the variable presence of non-tumor cell types within clinical
samples. 
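
The first and third analyses above both rest on a straight-line fit in log space. The sketch below shows how a log-log slope translates into "fold change in mRNA per 2-fold change in DNA copy number", and how a coefficient of determination can be read as a fraction of variation explained. It is an illustration only: the exact regression model used in the study is not spelled out in this excerpt, so treating R^2 as the reported "percent variation" is an assumption, and the input numbers are hypothetical.

```python
def linear_fit(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, r_squared)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    b = sxy / sxx
    a = my - b * mx
    r_squared = (sxy * sxy) / (sxx * syy)
    return a, b, r_squared


def fold_change_per_doubling(slope):
    """mRNA fold change accompanying a 2-fold change in DNA copy number,
    given the slope of log2(mRNA) regressed on log2(DNA)."""
    return 2.0 ** slope


# Hypothetical class averages on a log2 scale (not values from the study).
log_dna = [-1.0, 0.0, 1.0, 2.0, 3.0]
log_rna = [-0.5, 0.0, 0.5, 1.0, 1.5]
intercept, slope, r_squared = linear_fit(log_dna, log_rna)
print(slope, fold_change_per_doubling(slope), r_squared)
```

Under this reading, the reported 1.4- and 1.5-fold changes would correspond to log-log slopes of roughly 0.49 and 0.58, since 2^0.49 is about 1.4 and 2^0.58 is about 1.5.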

Discussion 

This genome-wide, array CGH analysis of DNA copy number 
alteration in a series of human breast tumors demonstrates the 
usefulness of defining amplicon boundaries at high resolution
(gene-by-gene), and quantitatively measuring amplicon shape, to
assist in locating and identifying candidate oncogenes. By
analyzing mRNA levels in parallel, we have also discovered that
changes in DNA copy number have a large, pervasive, direct
effect on global gene expression patterns in both breast cancer
cell lines and tumors. Although the DNA microarrays used in our
analysis may display a bias toward characterized and/or highly
expressed genes, because we are examining such a large fraction
of the genome (approximately 20% of all human genes), and
because, as detailed above, we are likely underestimating the
contribution of DNA copy number changes to altered gene
expression, we believe our findings are likely to be generalizable
(but would nevertheless still be remarkable if only applicable to
this set of ~6,100 genes).

In budding yeast, aneuploidy has been shown to result in 
chromosome-wide gene expression biases (13). Two recent
studies have begun to examine the global relationship between
DNA copy number and gene expression in cancer cells. In
agreement with our findings, Phillips et al. (14) have shown that
with the acquisition of tumorigenicity in an immortalized
prostate epithelial cell line, new chromosomal gains and losses
resulted in a statistically significant respective increase and
decrease in the average expression level of involved genes. In
contrast, Platzer et al. (15) recently reported that in metastatic
colon tumors only ~4% of genes within amplified regions were
found more highly (>2-fold) expressed, when compared with
normal colonic epithelium. This report differs substantially from
our finding that 62% of highly amplified genes in breast cancer
exhibit at least 2-fold increased expression. These contrasting
findings may reflect methodological differences between the
studies. For example, the study of Platzer et al. (15) may have
systematically under-measured gene expression changes. In this
regard it is remarkable that only 14 transcripts of many thousand
residing within unamplified chromosomal regions were found to
exhibit at least 4-fold altered expression in metastatic colon
cancer. Additionally, their reliance on lower-resolution
chromosomal CGH may have resulted in poorly delimiting the
boundaries of high-complexity amplicons, effectively overcalling
regions with amplification. Alternatively, the contrasting findings
for amplified genes may represent real biological differences
between breast and metastatic colon tumors; resolution of this 
issue will require further studies. 

Our finding that widespread DNA copy number alteration has 
a large, pervasive and direct effect on global gene expression
patterns in breast cancer has several important implications.
First, this finding supports a high degree of copy number-
dependent gene expression in tumors. Second, it suggests that
most genes are not subject to specific autoregulation or dosage
compensation. Third, this finding cautions that elevated
expression of an amplified gene cannot alone be considered strong
independent evidence of a candidate oncogene's role in
tumorigenesis. In our study, fully 62% of highly amplified genes
demonstrated moderately or highly elevated expression. This
highlights the importance of high-resolution mapping of amplicon
boundaries and shape [to identify the "driving" gene(s)
within amplicons (16)], on a large number of samples, in addition
to functional studies. Fourth, this finding suggests that analyzing
the genomic distribution of expressed genes, even within existing
microarray gene expression data sets, may permit the inference
of DNA copy number aberration, particularly aneuploidy (where
gene expression can be averaged across large chromosomal
regions; see Fig. 3 and supporting information). Fifth, this
finding implies that a substantial portion of the phenotypic
uniqueness (and by extension, the heterogeneity in clinical
behavior) among patients' tumors may be traceable to underlying
variation in DNA copy number. Sixth, this finding supports
a possible role for widespread DNA copy number alteration in
tumorigenesis (17, 18), beyond the amplification of specific
oncogenes and deletion of specific tumor suppressor genes.
Widespread DNA copy number alteration, and the concomitant 
widespread imbalance in gene expression, might disrupt critical
stoichiometric relationships in cell metabolism and physiology
(e.g., proteasome, mitotic spindle), possibly promoting further
chromosomal instability and directly contributing to tumor
development or progression. Finally, our findings suggest the
possibility of cancer therapies that exploit specific or global
imbalances in gene expression in cancer.

Figure 6-3 Genes can be expressed
with different efficiencies. Gene A is
transcribed and translated much more
efficiently than gene B. This allows the
amount of protein A in the cell to be
much greater than that of protein B.

FROM DNA TO RNA

Transcription and translation are the means by which cells read out, or express, 
the genetic instructions in their genes. Because many identical RNA copies can 
be made from the same gene, and each RNA molecule can direct the synthesis 
of many identical protein molecules, cells can synthesize a large amount of 
protein rapidly when necessary. But each gene can also be transcribed and 
translated with a different efficiency, allowing the cell to make vast quantities of 
some proteins and tiny quantities of others (Figure 6-3). Moreover, as we see in 
the next chapter, a cell can change (or regulate) the expression of each of its 
genes according to the needs of the moment — most obviously by controlling 
the production of its RNA. 



Portions of DNA Sequence Are Transcribed into RNA 

The first step a cell takes in reading out a needed part of its genetic instructions 
is to copy a particular portion of its DNA nucleotide sequence — a gene— into an 
RNA nucleotide sequence. The information in RNA, although copied into another 
chemical form, is still written in essentially the same language as it is in DNA — 
the language of a nucleotide sequence. Hence the name transcription. 

Like DNA, RNA is a linear polymer made of four different types of nucleotide
subunits linked together by phosphodiester bonds (Figure 6-4). It differs from
DNA chemically in two respects: (1) the nucleotides in RNA are
ribonucleotides, that is, they contain the sugar ribose (hence the name
ribonucleic acid) rather than deoxyribose; (2) although, like DNA, RNA contains the
bases adenine (A), guanine (G), and cytosine (C), it contains the base uracil (U) 
instead of the thymine (T) in DNA. Since U, like T, can base-pair by hydrogen- 
bonding with A (Figure 6-5), the complementary base-pairing properties 
described for DNA in Chapters 4 and 5 apply also to RNA (in RNA, G pairs with 
C, and A pairs with U). It is not uncommon, however, to find other types of base 
pairs in RNA: for example, G pairing with U occasionally. 
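
The base-pairing rule described above (G with C, A with U in RNA) can be illustrated with a short sketch. The function names are hypothetical, and the strand-orientation bookkeeping is simplified: the template is supplied already written 3' to 5', so the RNA can be built left to right.

```python
# Pairing of each DNA template base with the RNA base that it specifies.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """RNA (written 5'->3') complementary to a DNA template strand
    that is supplied written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


def coding_to_rna(coding_5_to_3):
    """The RNA has the same sequence as the non-template (coding)
    DNA strand, with U in place of T."""
    return coding_5_to_3.upper().replace("T", "U")


print(transcribe("TACGGT"))     # template strand, 3'->5'
print(coding_to_rna("ATGCCA"))  # coding strand, 5'->3'
```

Both calls describe the same stretch of double-stranded DNA, so they yield the same RNA sequence.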

Despite these small chemical differences, DNA and RNA differ quite dra- 
matically in overall structure. Whereas DNA always occurs in cells as a double- 
stranded helix, RNA is single-stranded. RNA chains therefore fold up into a 
variety of shapes, just as a polypeptide chain folds up to form the final shape of 
a protein (Figure 6-43). As we see later in this chapter, the ability to fold into com- 
plex three-dimensional shapes allows some RNA molecules to have structural 
and catalytic functions. 



Transcription Produces RNA Complementary to 
One Strand of DNA 

All of the RNA in a cell is made by DNA transcription, a process that has
certain similarities to the process of DNA replication discussed in Chapter 5.

Figure 6-89 Protein aggregates that cause human disease. (A) Schematic illustration of the type of
conformational change in a protein that produces material for a cross-beta filament. (B) Diagram illustrating
the self-infectious nature of the protein aggregation that is central to prion diseases. PrP is highly unusual
because the misfolded version of the protein, called PrP*, induces the normal PrP protein it contacts to
change its conformation, as shown. Most of the human diseases caused by protein aggregation are caused by
the overproduction of a variant protein that is especially prone to aggregation, but because this structure is
not infectious in this way, it cannot spread from one animal to another. (C) Drawing of a cross-beta filament,
a common type of protease-resistant protein aggregate found in a variety of human neurological diseases.
Because the hydrogen-bond interactions in a β sheet form between polypeptide backbone atoms (see Figure
3-9), a number of different abnormally folded proteins can produce this structure. (D) One of several
possible models for the conversion of PrP to PrP*, showing the likely change of two α-helices into four
β-strands. Although the structure of the normal protein has been determined accurately, the structure of the
infectious form is not yet known with certainty because the aggregation has prevented the use of standard
structural techniques. (C, courtesy of Louise Serpell, adapted from M. Sunde et al., J. Mol. Biol. 273:729-739,
1997; D, adapted from S.B. Prusiner, Trends Biochem. Sci. 21:482-487, 1996.)

animals and humans. It can be dangerous to eat the tissues of animals that con- 
tain PrP*, as witnessed most recently by the spread of BSE (commonly referred 
to as the "mad cow disease") from cattle to humans in Great Britain. 

Fortunately, in the absence of PrP*, PrP is extraordinarily difficult to convert
to its abnormal form. Although very few proteins have the potential to misfold
into an infectious conformation, a similar transformation has been discovered
to be the cause of an otherwise mysterious "protein-only inheritance" observed 
in yeast cells. 

There Are Many Steps From DNA to Protein

We have seen so far in this chapter that many different types of chemical reac- 
tions are required to produce a properly folded protein from the information 
contained in a gene (Figure 6-90). The final level of a properly folded protein in 
a cell therefore depends upon the efficiency with which each of the many steps 
is performed. 

We discuss in Chapter 7 that cells have the ability to change the levels of 
their proteins according to their needs. In principle, any or all of the steps in
Figure 6-90 could be regulated by the cell for each individual protein. However, as
we shall see in Chapter 7, the initiation of transcription is the most common
point for a cell to regulate the expression of each of its genes. This makes sense,
inasmuch as the most efficient way to keep a gene from being expressed is to
block the very first step, the transcription of its DNA sequence into an RNA
molecule.

Figure 6-90 The production of a
protein by a eucaryotic cell. The final
level of each protein in a eucaryotic cell
depends upon the efficiency of each step
depicted.



Summary 

The translation of the nucleotide sequence of an mRNA molecule into protein takes 
place in the cytoplasm on a large ribonucleoprotein assembly called a ribosome. The 
amino acids used for protein synthesis are first attached to a family of tRNA 
molecules, each of which recognizes, by complementary base-pair interactions, par- 
ticular sets of three nucleotides in the mRNA (codons). The sequence of nucleotides in 
the mRNA is then read from one end to the other in sets of three according to the 
genetic code. 
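
Reading an mRNA "in sets of three according to the genetic code" can be sketched as a table lookup. The table below is deliberately partial (a handful of codons from the standard genetic code), and locating the start codon by a simple left-to-right search is a simplification of how a ribosome actually finds it; the function name is hypothetical.

```python
# A small subset of the standard genetic code; None marks a stop codon.
CODONS = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly", "GCU": "Ala", "AAA": "Lys",
    "UAA": None, "UAG": None, "UGA": None,
}


def translate(mrna):
    """Translate from the first AUG, three nucleotides at a time,
    until a stop codon or the end of the sequence."""
    start = mrna.find("AUG")
    if start < 0:
        return []
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon not in CODONS:
            raise KeyError("codon not in this partial table: " + codon)
        amino_acid = CODONS[codon]
        if amino_acid is None:  # stop codon: release the chain
            break
        peptide.append(amino_acid)
    return peptide


print(translate("GGAUGUUUGGCAAAUAAGCU"))
```

Because the reading frame is fixed by the start codon, shifting the start by one nucleotide would change every codon that follows.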

To initiate translation, a small ribosomal subunit binds to the mRNA molecule 
at a start codon (AUG) that is recognized by a unique initiator tRNA molecule. A 
large ribosomal subunit binds to complete the ribosome and begin the elongation 
phase of protein synthesis. During this phase, aminoacyl tRNAs, each bearing a
specific amino acid, bind sequentially to the appropriate codon in mRNA by forming
complementary base pairs with the tRNA anticodon. Each amino acid is added to the 
C-terminal end of the growing polypeptide by means of a cycle of three sequential 



364 Chapter 6 : HOW CELLS READ THE GENOME: FROM DNA TO PROTEIN 



[Figure 7-5 diagram labels: NUCLEUS; CYTOSOL; mRNA degradation control (step 5)]

Figure 7-5 Six steps at which
eucaryotic gene expression can be
controlled. Controls that operate at
steps 1 through 5 are discussed in this
chapter. Step 6, the regulation of protein
activity, includes reversible activation or
inactivation by protein phosphorylation
(discussed in Chapter 3) as well as
irreversible inactivation by proteolytic
degradation (discussed in Chapter 6).



Gene Expression Can Be Regulated at Many of the Steps 
in the Pathway from DNA to RNA to Protein 

If differences among the various cell types of an organism depend on the partic- 
ular genes that the cells express, at what level is the control of gene expression 
exercised? As we saw in the last chapter, there are many steps in the pathway 
leading from DNA to protein, and all of them can in principle be regulated. Thus
a cell can control the proteins it makes by (1) controlling when and how often a
given gene is transcribed (transcriptional control), (2) controlling how the RNA
transcript is spliced or otherwise processed (RNA processing control), (3)
selecting which completed mRNAs in the cell nucleus are exported to the cytosol
and determining where in the cytosol they are localized (RNA transport and
localization control), (4) selecting which mRNAs in the cytoplasm are translated
by ribosomes (translational control), (5) selectively destabilizing certain mRNA
molecules in the cytoplasm (mRNA degradation control), or (6) selectively acti- 
vating, inactivating, degrading, or compartmentalizing specific protein 
molecules after they have been made (protein activity control) (Figure 7-5). 

For most genes transcriptional controls are paramount. This makes sense 
because, of all the possible control points illustrated in Figure 7-5, only tran- 
scriptional control ensures that the cell will not synthesize superfluous interme- 
diates. In the following sections we discuss the DNA and protein components 
that perform this function by regulating the initiation of gene transcription. We 
shall return at the end of the chapter to the additional ways of regulating gene 
expression. 

Summary 

The genome of a cell contains in its DNA sequence the information to make many 
thousands of different protein and RNA molecules. A cell typically expresses only a 
fraction of its genes, and the different types of cells in multicellular organisms arise 
because different sets of genes are expressed. Moreover, cells can change the pattern 
of genes they express in response to changes in their environment, such as signals 
from other cells. Although all of the steps involved in expressing a gene can in prin- 
ciple be regulated, for most genes the initiation of RNA transcription is the most 
important point of control.



DNA-BINDING MOTIFS IN GENE REGULATORY 
PROTEINS 

How does a cell determine which of its thousands of genes to transcribe? As 
mentioned briefly in Chapters 4 and 6, the transcription of each gene is con- 
trolled by a regulatory region of DNA relatively near the site where transcription 
begins. Some regulatory regions are simple and act as switches that are thrown 
by a single signal. Many others are complex and act as tiny microprocessors,
responding to a variety of signals that they interpret and integrate to switch the
neighboring gene on or off. Whether complex or simple, these switching devices

occur in the germ line, the cell lineage that gives rise to sperm or eggs. Most of
the DNA in vertebrate germ cells is inactive and highly methylated. Over long 
periods of evolutionary time, the methylated CG sequences in these inactive 
regions have presumably been lost through spontaneous deamination events 
that were not properly repaired. However, promoters of genes that remain active
in the germ cell lineages (including most housekeeping genes) are kept
unmethylated, and therefore spontaneous deaminations of Cs that occur within
them can be accurately repaired. Such regions are preserved in modern-day
vertebrate cells as CG islands. In addition, any mutation of a CG sequence in the 
genome that destroyed the function or regulation of a gene in the adult would be 
selected against, and some CG islands are simply the result of a higher than nor- 
mal density of critical CG sequences. 

The mammalian genome contains an estimated 20,000 CG islands. Most of 
the islands mark the 5' ends of transcription units and thus, presumably, of 
genes. The presence of CG islands often provides a convenient way of identify- 
ing genes in the DNA sequences of vertebrate genomes. 
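
The "deficiency" of CG sequences and their clustering into islands can be put in numbers with an observed/expected ratio. The measure below is a widely used convention and is not described in this text; the function name and the toy sequences are hypothetical. A ratio near 1 means that CG occurs about as often as the C and G content alone would predict, while a ratio well below 1 indicates depletion.

```python
def cpg_obs_exp(seq):
    """Observed/expected CG dinucleotide ratio for a DNA sequence.

    expected count = (number of C) * (number of G) / length
    (a common convention, assumed here for illustration).
    """
    seq = seq.upper()
    n = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    observed = seq.count("CG")  # "CG" cannot overlap itself
    return observed * n / (c_count * g_count)


print(cpg_obs_exp("CGCGCG"))  # CG-rich toy sequence
print(cpg_obs_exp("CCGG"))    # CG as frequent as expected
print(cpg_obs_exp("CATG"))    # contains C and G but no CG
```

In practice such a ratio is computed over a sliding window of a few hundred nucleotides rather than over toy strings like these.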

Summary 

The many types of cells in animals and plants are created largely through
mechanisms that cause different genes to be transcribed in different cells. Since many
specialized animal cells can maintain their unique character through many cell 
division cycles and even when grown in culture, the gene regulatory mechanisms 
involved in creating them must be stable once established and heritable when the 
cell divides. These features endow the cell with a memory of its developmental history. 
Bacteria and yeasts provide unusually accessible model systems in which to study 
gene regulatory mechanisms. One such mechanism involves a competitive interac- 
tion between two gene regulatory proteins, each of which inhibits the synthesis of the 
other; this can create a flip-flop switch that switches a cell between two alternative 
patterns of gene expression. Direct or indirect positive feedback loops, which enable 
gene regulatory proteins to perpetuate their own synthesis, provide a general mech- 
anism for cell memory. Negative feedback loops with programmed delays form the 
basis for cellular clocks. 
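
The flip-flop switch described above can be illustrated with a toy simulation of two proteins, A and B, each repressing the synthesis of the other. This is a sketch under stated assumptions and not a model given in the text: the repression term (a Hill-type function), the parameter values alpha and n, and the simple Euler integration are all chosen only for illustration.

```python
def simulate_toggle(a0, b0, alpha=4.0, n=2, dt=0.01, steps=5000):
    """Euler integration of a mutual-repression model:

        dA/dt = alpha / (1 + B**n) - A
        dB/dt = alpha / (1 + A**n) - B

    alpha (maximal synthesis rate) and n (steepness of repression)
    are hypothetical values chosen so that two stable states exist.
    """
    a, b = a0, b0
    for _ in range(steps):
        da = alpha / (1.0 + b ** n) - a
        db = alpha / (1.0 + a ** n) - b
        a += dt * da
        b += dt * db
    return a, b


# The same circuit settles into opposite states depending on which
# protein has the head start.
print(simulate_toggle(2.0, 0.1))  # A starts ahead
print(simulate_toggle(0.1, 2.0))  # B starts ahead
```

Once the circuit has settled, the winning protein keeps the other one repressed, so the state persists without any continuing signal. This is the sense in which such a switch gives the cell a memory.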

In eucaryotes the transcription of a gene is generally controlled by combinations 
of gene regulatory proteins. It is thought that each type of cell in a higher eucaryotic 
organism contains a specific combination of gene regulatory proteins that ensures 
the expression of only those genes appropriate to that type of cell. A given gene
regulatory protein may be active in a variety of circumstances and typically is involved
in the regulation of many genes. 

In addition to diffusible gene regulatory proteins, inherited states of chromatin 
condensation are also used by eucaryotic cells to regulate gene expression. An espe- 
cially dramatic case is the inactivation of an entire X chromosome in female mam- 
mals. In vertebrates DNA methylation also functions in gene regulation, being used 
mainly as a device to reinforce decisions about gene expression that are made
initially by other mechanisms. DNA methylation also underlies the phenomenon of
genomic imprinting in mammals, in which the expression of a gene depends on
whether it was inherited from the mother or the father.

Figure 7-86 A mechanism to explain
both the marked overall deficiency
of CG sequences and their clustering
into CG islands in vertebrate
genomes. A black line marks the location
of a CG dinucleotide in the DNA
sequence, while a red "lollipop" indicates
the presence of a methyl group on the
CG dinucleotide. CG sequences that lie in
regulatory sequences of genes that are
transcribed in germ cells are unmethylated
and therefore tend to be retained in
evolution. Methylated CG sequences, on
the other hand, tend to be lost through
deamination of 5-methyl C to T, unless the
CG sequence is critical for survival.



POSTTRANSCRIPTIONAL CONTROLS 

In principle, every step required for the process of gene expression could be 
controlled. Indeed, one can find examples of each type of regulation, although 
any one gene is likely to use only a few of them. Controls on the initiation of 
gene transcription are the predominant form of regulation for most genes. But 
other controls can act later in the pathway from DNA to protein to modulate 
the amount of gene product that is made. Although these posttranscriptional 
controls, which operate after RNA polymerase has bound to the gene's promoter 
and begun RNA synthesis, are less common than transcriptional control, for 
many genes they are crucial.